1. I am a co-inventor of the subject matter disclosed and claimed in the
above-captioned patent application.

Scytovirin Fragments and Variants 

2. The amino acid sequence of wild-type scytovirin is described in the 
present application (i.e., SEQ ID NO: 1, which is 95 amino acids in length). 

3. A DNA construct encoding amino acids 1-48 of scytovirin (SD1),
wherein the cysteine at residue 7 is substituted with serine, was constructed,
expressed, and isolated. An XTT-tetrazolium based assay was used to determine the
anti-HIV activity of SD1 and scytovirin against acute HIV-1 infection in CEM-SS
cells. The protocol followed was the same as described in the present application
(see, e.g., Example 5).

4. SD1 and scytovirin showed comparable activity against the T-tropic
laboratory strain HIV-1RF in CEM-SS cells, with EC50 values of 6.6 nM and 7.5 nM,
respectively. Toxicity to the CEM-SS cell line was not detected for either scytovirin
or SD1 at concentrations up to 10,000 nM. Thus, the SD1 fragment of scytovirin
retained scytovirin's potent anti-viral (e.g., anti-HIV) activity.

5. Additional experiments were performed to determine whether the N-
and C-termini of SD1 are necessary for antiviral activity. In particular, a series of
deletion mutants was constructed in which two, five, or ten amino acids were deleted
from the N-terminus of SD1, or eight amino acids were deleted from the C-terminus
of SD1, as described in Xiong et al., Peptides, 27: 1668-1675 (2006) (Attachment A).
The deletion mutants were subjected to the XTT-tetrazolium based assay.

6. Deletion of two, five, or ten N-terminal amino acids from SD1
completely eliminated antiviral activity, indicating that N-terminal amino acids of
SD1 are necessary for maintaining the antiviral activity of SD1. Deletion of the eight
C-terminal amino acids resulted in a 3- to 7-fold decrease in anti-HIV potency,
indicating that these C-terminal amino acids optimize the activity of SD1, but are not
necessary in order to retain some anti-viral activity.

7. In addition to the scytovirin fragments and variants described above, it
is possible to generate proteins comprising several mutations in the scytovirin amino
acid sequence (e.g., proteins comprising 90% identity to the amino acid sequence set
forth in SEQ ID NO: 1), which retain antiviral activity (see paragraphs [0025]-[0026]
of the specification). Moreover, one of ordinary skill in the art has the ability to
determine which amino acids of SEQ ID NO: 1 to alter to generate mutant scytovirins
having antiviral activity using routine laboratory techniques.
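
As a rough illustration of what a 90% identity threshold means for a 95-residue sequence such as SEQ ID NO: 1, the following minimal sketch counts how many positions could differ while identity stays at or above the threshold. The function name is hypothetical, and the sketch assumes identity is computed as identical positions divided by the full sequence length with substitutions only (no insertions or deletions).

```python
def max_substitutions(length: int, min_identity_pct: int) -> int:
    """Largest number of substituted positions that keeps identity >= min_identity_pct.

    Assumes identity = identical positions / length, substitutions only.
    """
    # Integer ceiling division avoids floating-point rounding at the threshold.
    min_identical = -(-length * min_identity_pct // 100)
    return length - min_identical


if __name__ == "__main__":
    # 95 residues (length of SEQ ID NO: 1) at a 90% identity threshold.
    print(max_substitutions(95, 90))  # 9
```

Under these assumptions, up to nine of the 95 positions could be substituted (86/95 is about 90.5% identity), whereas ten substitutions would fall just below 90%.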

8. To identify amino acid residues appropriate for manipulation to
generate a functional scytovirin protein, the ordinarily skilled artisan can determine
the three-dimensional structure of the scytovirin protein from SEQ ID NO: 1. Indeed,
the three-dimensional structure of scytovirin is described in McFeeters et al., J. Mol.
Biol., 369: 451-461 (Attachment B). Ideally, mutations that do not modify the
electronic or structural environment of the protein are generated to retain optimal
antiviral activity. By utilizing information regarding the three-dimensional structure
of the protein and the amino acid sequence of SEQ ID NO: 1, determination of which
amino acids are critical for proper protein folding by way of their location within the
protein structure or interaction with surrounding residues is within the skill of the
ordinary researcher.

9. In addition to creating mutations in regions of the amino acid sequence
of SEQ ID NO: 1 that are not critical for three-dimensional structure, the ordinary
researcher can determine which amino acid residues are likely responsible for viral
binding. It is understood in the art that surface hydrophobicity plays a key role in
protein-protein interactions, such as the interaction between scytovirin and viral
proteins. The ordinary researcher has the ability to map hydrophobic surface clusters
on the scytovirin protein or scytovirin homologs to predict regions critical for
interaction with the viral envelope using routine methods such as those disclosed in
McFeeters et al., supra; Botos et al., J. Biol. Chem., 277(13): 34336-34342 (2002)
(Attachment C); and Bewley et al., Nature Structural Biology, 5(7): 571-578 (1998)
(Attachment D). Amino acid residues not found in hydrophobic surface clusters are
likely not critical for hydrophobicity of these clusters and, thus, are appropriate targets
for mutation to create scytovirin variants (e.g., with 90% identity to SEQ ID NO: 1),
which retain antiviral activity.

10. With respect to screening for antiviral activity, the assays described in
the present application (see, e.g., Example 5) are well within the skill of the ordinary
researcher and require only routine laboratory techniques.

Scytovirin Antiviral Activity 

11. As described in the specification, scytovirin, as well as variants and 
fragments thereof, have potent antiviral activity. In addition to anti-HIV activity, 
scytovirin has potent anti-viral activity against Ebola and influenza viruses. 

12. To demonstrate scytovirin's activity against Ebola virus, VeroE6 cells
plated in 96-well plates were infected with recombinant Ebola (Zaire) virus expressing
green fluorescent protein (GFP). Various concentrations of scytovirin and cyanovirin
were introduced to the cells before Ebola virus infection. After incubation for 48
hours, GFP expression was detected by fluorescence.

13. Scytovirin demonstrated potent antiviral activity with an EC50 value of
about 0.4 µg/ml, while cyanovirin had an EC50 value of 3.5 µg/ml. When greater than
about 1 µg/µl scytovirin was added to the cells before Ebola virus infection, almost no
GFP expression was detected, indicating scytovirin's inhibition of Ebola virus
infection. In comparison, even at concentrations of about 100 µg/µl cyanovirin,
fluorescence was detected, indicating infection of the cells by Ebola virus.

14. To further demonstrate scytovirin's activity against Ebola virus, 20 g
female Balb/C mice were challenged with 1000 pfu/mouse of mouse-adapted Ebola
(Zaire) virus. Twenty-four hours prior to viral challenge, the mice began receiving
twice-daily (Q12) 10 mg/kg subcutaneous injections of scytovirin in phosphate
buffered saline (PBS).

15. Due to the highly infectious nature of Ebola virus, the ability of 
scytovirin to inhibit viral infection in mice at risk thereof was evaluated by prolonged 
survival of treated mice compared to untreated control mice. By Day 8, only 20% of 
the control mice survived. In contrast, 60% of the scytovirin-treated mice survived.
Indeed, 60% of the scytovirin-treated animals survived until Day 20, the last 
timepoint of the study. 

16. Thus, scytovirin has potent antiviral activity against Ebola virus both 
in vitro and in vivo. 

17. The anti-influenza virus activity of scytovirin was tested in the
following manner. Host cells used for these assays were obtained at the Southern
Research Institute-Frederick (SoRI-Frederick, Frederick, Maryland). MDCK cells
were used for all assays. The humanized H5N1 influenza virus strains that were used
in the assays included Hong Kong/491/1997 and Vietnam/1203/2004H. The avian
H5N1 influenza virus strains that were used in the assays included Duck/MN/1525/81
and Gull/PA/4175/83. Additionally, humanized influenza virus strains
B/Shanghai/362/02, A/New Caledonia/20/99 (H1N1), and A/California/7/04 (H3N2)
were used.

18. A typical antiviral assay for each virus was as follows. A pretreated
aliquot of virus was removed from the freezer (-80° C) and allowed to thaw. The
virus was diluted into tissue culture medium such that the amount of virus added to
each well in a volume of 100 µl was that amount pre-determined to yield complete
cell killing at 3-7 days (depending on the virus) post-infection. Scytovirin (SEQ ID
NO: 1) stock solution was diluted into the medium to the desired high concentration
and then further diluted in the medium in 0.5 log10 increments.
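
The 0.5 log10 dilution scheme described above can be sketched as follows. This is only an illustration: the starting concentration of 100 and the number of points are hypothetical values chosen for the example, not values stated in the declaration.

```python
def half_log_dilutions(high_conc: float, n_points: int, step_log10: float = 0.5) -> list[float]:
    """Concentrations starting at high_conc, each lower by step_log10 log10 units."""
    return [high_conc / 10 ** (step_log10 * i) for i in range(n_points)]


if __name__ == "__main__":
    # Hypothetical example: six points starting from 100 (arbitrary concentration units).
    for conc in half_log_dilutions(100.0, 6):
        print(f"{conc:.3g}")
```

Each step divides the concentration by the square root of 10 (about 3.16), so two consecutive steps give a ten-fold dilution.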

19. The day following plating of cells, plates were removed from the
incubator, and medium was removed and discarded. The infection medium was
DMEM (Dulbecco's Minimal Essential Medium) supplemented with 0.3% BSA, 100
U/ml penicillin, 100 µg/ml streptomycin, 0.1 mM non-essential amino acids, 0.1 mM
sodium pyruvate, 2 mM L-glutamine, and 2.0 µg/ml TPCK-treated
(N-tosyl-L-phenylalanine ketone) trypsin. Drug dilutions (6-12 dilutions) were made in medium
and added to appropriate wells of a 96-well microtiter plate in a volume of 100 µl per
well. Each dilution was set up in triplicate. Infection medium containing
appropriately diluted virus was added to appropriate wells of the microtiter plate.
Each plate contained cell control wells (cells only), virus control wells (cells plus
virus), drug toxicity control wells (cells plus drug only), drug colorimetric control
wells (drug only), as well as experimental wells (drug plus cells plus virus).

20. After 7 days of incubation at 37° C in a 5% CO2 incubator, the test
plates were analyzed by staining with CellTiter 96™ AQueous One Solution Reagent
(Promega, Madison, WI). Twenty microliters of the reagent were added to each well
of the plate, and the plate was reincubated for 4 hrs at 37° C. CellTiter 96™ AQueous
One Solution Reagent contains a tetrazolium (MTS) compound and the
electron-coupling reagent phenazine ethosulfate (PES). MTS-tetrazolium is bioreduced by
cells into a colored formazan product that is soluble in tissue culture medium. This
conversion is mediated by NADPH and NADH produced by dehydrogenase enzymes
in metabolically active cells. The amount of soluble formazan produced by cellular
reduction of the MTS was determined by measuring the absorbance of light at a
wavelength of 490 nm. The quantity of formazan product as determined by the
amount of 490 nm wavelength light absorbance was directly proportional to the
number of living cells in the culture, thus allowing the rapid quantitative analysis of
the inhibition of virus-induced cell killing by the test substances.
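
One common way to turn such absorbance readings into a measure of protection is sketched below. The formula, the function name, and the example absorbance values are assumptions made for illustration; the declaration does not state which normalization was actually used. The sketch scales the experimental well between the virus control (0% protection) and the cell control (100% protection) after subtracting the drug colorimetric control.

```python
def percent_protection(a_test: float, a_virus_control: float,
                       a_cell_control: float, a_drug_blank: float = 0.0) -> float:
    """Percent protection from virus-induced cell killing (assumed normalization).

    a_test          : absorbance at 490 nm of the experimental well (drug + cells + virus)
    a_virus_control : absorbance of the virus control well (cells + virus)
    a_cell_control  : absorbance of the cell control well (cells only)
    a_drug_blank    : absorbance of the drug colorimetric control well (drug only)
    """
    window = a_cell_control - a_virus_control
    if window <= 0:
        raise ValueError("cell control must read higher than virus control")
    corrected = a_test - a_drug_blank
    return 100.0 * (corrected - a_virus_control) / window


if __name__ == "__main__":
    # Hypothetical absorbance values, for illustration only.
    print(round(percent_protection(0.8, 0.2, 1.4), 1))
```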

21. The IC50 (50% inhibition of virus replication), EC50 (50% reduction in
cell viability), EC90 (90% reduction in cell viability), and the corresponding antiviral
index (IC50/EC50) were calculated using standard procedures. Results from testing of
scytovirin (SEQ ID NO: 1) against different strains of influenza virus are shown in the
tables below.

Table 1: Anti-H5N1 Influenza Virus Activity of Scytovirin

Virus Strain         Assay         EC50 (µg/ml)   EC90    IC50    Antiviral Index
Duck/MN/1525/81      Visual        20                     >100    >5
                     Virus Yield                  >100    >100
Gull/PA/4175/83      Visual        20                     >100    >5
                     Virus Yield                  33      >100    >3
Hong Kong/491/1997   Visual        0.4                    >100    >250
                     Virus Yield                  >100    >100
Vietnam/1203/2004H   Visual        4                      >100    >25
                     Virus Yield                  2       >100    >50
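
The antiviral index values for the visual assay can be reproduced with a short calculation. This is a sketch under two assumptions: that the index is the IC50/EC50 ratio defined in paragraph 21, and that the ">100" entries are lower bounds on the IC50, so each result is itself a lower bound. The function name is hypothetical.

```python
def antiviral_index(ic50: float, ec50: float) -> float:
    """Antiviral index as the ratio IC50/EC50 (both in the same units)."""
    if ec50 <= 0:
        raise ValueError("EC50 must be positive")
    return ic50 / ec50


if __name__ == "__main__":
    ic50_lower_bound = 100.0  # ">100" in the table, treated as a lower bound
    visual_ec50 = {
        "Duck/MN/1525/81": 20.0,
        "Hong Kong/491/1997": 0.4,
        "Vietnam/1203/2004H": 4.0,
    }
    for strain, ec50 in visual_ec50.items():
        print(f"{strain}: antiviral index > {antiviral_index(ic50_lower_bound, ec50):g}")
```

Because the numerator is only a lower bound, the true index for each strain may be higher than the value computed this way.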



Table 2: Anti-Influenza Virus Activity of Scytovirin

Virus Strain                    EC50 (µg/ml)
B/Shanghai/362/02               0.18           >10
A/New Caledonia/20/99 (H1N1)    0.14           >10
A/California/7/04 (H3N2)        0.18           >10



22. As illustrated by the data set forth in the table, scytovirin is active
against all tested strains of humanized influenza virus as demonstrated by an EC50
value of less than 10 µg/ml. Accordingly, scytovirin is effective for inhibiting an
influenza viral infection of a host, and in particular humanized influenza viral
infections.

23. I hereby declare that all statements made herein of my own knowledge
are true, that all statements made on information and belief are believed to be true,
that these statements were made with the knowledge that willful false statements and
the like so made are punishable by fine or imprisonment or both, under Section 1001
of Title 18 of the United States Code, and that such willful false statements may
jeopardize the validity of the application or any patent issued thereon.

ABSTRACT



Scytovirin (SVN) is a novel anti-HIV protein isolated from aqueous extracts of the cultured
cyanobacterium Scytonema varium. SVN contains two apparent domains, one comprising
amino acids 1-48 and the second stretching from amino acids 49 to 95. These two domains
display significant homology to each other and a similar pattern of disulfide bonds. Two
DNA constructs encoding scytovirin 1-48 (Cys7Ser) (SD1) and 49-95 (Cys55Ser) (SD2) were
constructed and expressed in E. coli, with thioredoxin fused to their N-terminus. Purified
recombinant products were tested for binding activities with the HIV surface envelope
glycoproteins gp120 and gp41. Whole cell anti-HIV data showed that SD1 had similar
anti-HIV activity to the full-length SVN, whereas SD2 had significantly less anti-HIV activity.
Further deletion mutants of the SD1 domain (SVN(3-45)Cys7Ser, SVN(6-45)Cys7Ser,
SVN(11-45)Cys7Ser) showed that the N-terminal residues are necessary for full anti-HIV activity of
SD1 and that an eight amino acid deletion from the C-terminus (SVN(1-40)Cys7Ser) had a
significant effect, decreasing the anti-HIV activity of SD1 by approximately five-fold.

Published by Elsevier Inc. 



1. Introduction 

Viral infections remain among the most formidable causes of
human and non-human animal morbidity and mortality
worldwide. Effective prevention and therapies against most
viral pathogens remain elusive. One of the most recent and
catastrophic examples is the rapidly expanding and pervasive
worldwide pandemic of human immunodeficiency virus (HIV)
infection and acquired immune deficiency syndrome (AIDS).
The need for effective preventative and therapeutic agents for
HIV/AIDS and other potentially lethal viral diseases remains
an urgent global priority.

Antimicrobial peptides and proteins are the essential
molecular effectors of innate immunity in humans and other
vertebrates to combat microbial challenge [7,8]. Peptides offer
tremendous structural diversity that can be exploited for the 
development of novel therapeutics and prophylactics for 



* Corresponding author. Tel.: +1 301 846 5332; fax: +1 301 846 6177. 
E-mail address: okeefe@ncifcrf.gov (B.R. O'Keefe). 
0196-9781/$ - see front matter. Published by Elsevier Inc. 
doi:10.1016/j.peptides.2006.03.018



many different diseases. For example, in the field of HIV 
therapeutics a novel, rationally constructed peptide molecule 
known as T-20 [12] has been shown to be a potent inhibitor of 
HIV/cell fusion. Furthermore, naturally occurring, non-mammalian
peptides and proteins offer new avenues for antiviral
discovery and development [4,5,17].

Scytovirin (SVN), a potent anti-HIV protein, was originally
isolated from aqueous extracts of the cyanobacterium
Scytonema varium [3]. SVN, with a molecular mass of
9713 Da, contains five intrachain disulfide bonds and binds
to the HIV-1 envelope glycoproteins gp120, gp160 and gp41, but
not to the cellular receptor CD4 [3]. Low nanomolar concentrations
of SVN inactivate both laboratory strains and primary
isolates of HIV-1. This inhibition has been shown to involve
selective interactions between SVN and high-mannose
oligosaccharides [3]. Specifically, SVN interacts with oligosaccharides
containing α1-2, α1-2, α1-6 tetramannoside units [2].

1 GSGPTYCWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKPDPG 48
49 PKGPTYCWDEAKNPGGPNRCSNSKQCDGARTCSSSGFCQGTAGHAAA 95 



Fig. 1 - Sequence alignment between the two domains of SVN. Amino acids are represented by single-letter codes. Identical
sites are represented by Conserved sites are represented by Disulfide bonds are marked with solid lines above the 
sequence. 



SVN shows strong internal sequence duplication. When
amino acids 1-48 and 49-95 are aligned, 36 residues (75%)
are identical and three (6%) represent conservative amino
acid changes (Fig. 1). Furthermore, NMR analysis of SVN
revealed that the protein has two domains with strong
symmetry [Drs. McFeeters and Byrd, personal communication].
Based on the above facts, DNA constructs for SVN
single domains [amino acids 1-48 (SD1) and 49-95 (SD2)]
were produced and expressed to explore functional domains
and sequence requirements for gp120 (or gp41) binding and
anti-HIV activity of SVN. Furthermore, additional truncation
mutants of SD1 were expressed and their anti-HIV activity
assessed.
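
The 36-residue (75%) identity figure can be checked with a short position-by-position comparison. This sketch uses the SD1 and SD2 sequences as listed in Fig. 2 (the Cys7Ser and Cys55Ser forms, which carry serine at the same aligned position) and assumes that a simple ungapped alignment starting at the first residue of each domain reproduces the alignment of Fig. 1. The function name is hypothetical.

```python
SD1 = "GSGPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKPDPG"  # SVN(1-48)Cys7Ser
SD2 = "PKGPTYSWDEAKNPGGPNRCSNSKQCDGARTCSSSGFCQGTAGHAAA"   # SVN(49-95)Cys55Ser


def ungapped_identity(seq_a: str, seq_b: str) -> tuple[int, float]:
    """Count identical positions in an ungapped alignment anchored at residue 1.

    Percent identity is reported relative to the longer sequence.
    """
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, 100.0 * matches / max(len(seq_a), len(seq_b))


if __name__ == "__main__":
    identical, percent = ungapped_identity(SD1, SD2)
    print(identical, percent)  # 36 75.0
```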



2. Materials and methods

2.1. Materials 

Restriction endonucleases and T4 DNA ligase were obtained
from New England Biolabs (Beverly, MA). Pfu DNA polymerase
was obtained from Stratagene (La Jolla, CA). Rabbit anti-SVN
polyclonal antibodies were produced as described previously
[3]. Recombinant enterokinase was from Novagen (Madison,
WI). Recombinant glycosylated gp120 was obtained from
Intracel Corporation (Cambridge, MA). Glycosylated gp41
(HIV-1 gp41hp) was purchased from Viral Therapeutics, Inc.
(Ithaca, NY). Reagents for sodium dodecyl sulfate/polyacrylamide
gel electrophoresis (SDS-PAGE) were obtained from
Invitrogen (Carlsbad, CA). All other chemicals were analytical
reagent grade from Sigma (St. Louis, MO). The plasmid vector
pET32C(+) and E. coli strain Origami B(DE3)pLysS were from
Novagen (Madison, WI).

2.2. DNA amplification

For PCR amplification, pET(SVN) plasmid [21] was used as a
template. Primers synthesized by Integrated DNA Technologies,
Inc. (Coralville, IA) are as follows:

SD1 (SVN(1-48)Cys7Ser):
Forward primer: 5'-CATGCCATGGCTGGTTCTGGTCCGACCTACTCTTGGAACG-3'
Reverse primer: 5'-GCGCTGGAGTTACCCCGGGTCCGGTTTACGAGA-3'

SD2 (SVN(49-95)Cys55Ser):
Forward primer: 5'-CATGCCATGGCTGCCAAAGGTCCGACCTACTCTTGGGACGAG-3'
Reverse primer: 5'-CCCGCTCGAGTTACGCAGCCGCGTGACCGGCGG-3'

SVN(3-45)Cys7Ser:
Forward primer: 5'-CATGCCATGGCTGGTCCGACCTACTCTTGGAAC-3'
Reverse primer: 5'-GCGCTCGAGTTACGGTTTACGAGAGGTACGCTGGC-3'

SVN(6-45)Cys7Ser:
Forward primer: 5'-CATGCCATGGCTTACTCTTGGAAGGAAGCGAAG-3'
Reverse primer: 5'-GCGCTCGAGTTAGGGTTTACGAGAGGTACCCTGGC-3'

SVN(11-45)Cys7Ser:
Forward primer: 5'-CATGCCATGGCTGCGAAGAACCCGGGTGGTCCGAAC-3'
Reverse primer: 5'-GCGCTCGAGTTACGGTTTACGAGAGGTACCCTGGC-3'

SVN(1-40)Cys7Ser:
Forward primer: 5'-CATGCCATGGCTGGTTCTGGTCCGACCTACTCTTGGAACG-3'
Reverse primer: 5'-GGTTTCTGCCAGGGTTAATCTCGTAAACCGGAC-3'

The above PCR fragments were each digested with NcoI-XhoI
and inserted into an NcoI-XhoI-digested pET32C(+)
vector. The linker sequence between the enterokinase digestion
site and SVN DNA was deleted from the construct by
site-directed mutagenesis (GeneTailor Site-Directed Mutagenesis
System, Invitrogen) and the resulting plasmid vector was
transformed into E. coli strain DH5α. The sequence of the gene
was confirmed by DNA sequencing.
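
A quick way to confirm that a primer carries the cloning site it is meant to introduce is to search it for the enzyme's recognition sequence. The sketch below does this for the SD2 primer pair, using the standard recognition sequences for NcoI (CCATGG) and XhoI (CTCGAG); the helper name is hypothetical and the check is purely textual.

```python
# Standard recognition sequences for the two enzymes used in the cloning step.
RECOGNITION_SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG"}


def find_sites(primer: str) -> dict[str, int]:
    """Return the 0-based position of each recognition site present in the primer."""
    primer = primer.upper()
    return {name: primer.find(site)
            for name, site in RECOGNITION_SITES.items()
            if site in primer}


if __name__ == "__main__":
    sd2_forward = "CATGCCATGGCTGCCAAAGGTCCGACCTACTCTTGGGACGAG"
    sd2_reverse = "CCCGCTCGAGTTACGCAGCCGCGTGACCGGCGG"
    print(find_sites(sd2_forward))  # {'NcoI': 4}
    print(find_sites(sd2_reverse))  # {'XhoI': 4}
```

In both primers the site sits four bases from the 5' end, which leaves a short overhang for the restriction enzyme to bind.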

2.3. Expression of SVN fusion protein in E. coli 

The above plasmids expressing various SVN mutants under 
the control of a T7 promoter were transformed into E. coli 
Origami B(DE3)pLysS. The cells containing the constructs 
were grown at 37 °C in LB medium (Luria-Bertani broth)
containing 100 µg/mL ampicillin, 34 µg/mL chloramphenicol
and 15 µg/mL kanamycin to an OD600 of 0.5-0.8. The cells
were induced with isopropyl-1-thio-β-D-galactopyranoside
(IPTG) to a final concentration of 1 mM, then incubated for
an additional 4 h and collected by centrifugation. The cells
were either prepared immediately for SDS-PAGE assay or
frozen at -80 °C.

2.4. Cell fractionation

Frozen cells were thawed in 50 mM phosphate buffer containing
0.4 mg/mL lysozyme and 1 mM phenylmethylsulfonylfluoride
(PMSF). Efficient lysis occurred over several minutes.
DNase I (20 µg/mL) and MgCl2 (10 mM) were then added to
digest DNA. Crude lysates were next centrifuged to separate
soluble and insoluble protein fractions.

2.5. Peptide purification 

The expressed peptides were initially purified by histidine tag
metal affinity chromatography (BD) according to the
manufacturer's protocol. Briefly, 25 mL cleared lysate from 1 L of LB
cultured bacterial pellet was incubated with 2 mL affinity resin
that had been equilibrated with 10 column volumes of buffer A
(50 mM phosphate buffer, pH 7.5, 300 mM NaCl, 5 mM imidazole,
1 mM PMSF). The protein-bound resin was then washed
with 10 column volumes of buffer A. SVN mutant fusion
proteins were then eluted with 10 column volumes of 150 mM
imidazole in buffer A.

The resulting samples were filtered through an Amicon
Centriprep 3 kDa molecular mass cutoff filter to a volume of
2 mL to desalt and concentrate the material. They were then
digested for 16 h at room temperature with recombinant
enterokinase (rEK, Novagen), at 1 unit of enterokinase/10 µg
of recombinant protein. rEK-cleaved SVN mutants were then
purified using reversed-phase HPLC. The peptides were
loaded on a Dynamax C18 300 Å column and eluted with a
gradient of 0 to 60% acetonitrile in 0.05% aqueous TFA over
60 min at a flow rate of 3 mL/min with UV monitoring at 210
and 280 nm.

2.6. Mass spectroscopic analysis

The purified recombinant SVN mutants were analyzed by LC-MS
using an Agilent high-performance liquid chromatography/
electrospray ionization quadrupole mass spectrometer (model
1100D) using a C8 Zorbax column (2.1 mm x 110 mm) eluted
with a linear gradient from 0 to 100% acetonitrile over 35 min
in H2O with 5% (v/v) acetic acid in the mobile phase at a flow
rate of 0.2 mL/min and with UV monitoring at 280 nm. After
mass spectral deconvolution according to the manufacturer's
protocols, the masses of the various SVN mutants were
calculated.

2.7. Immunoblotting

Peptide samples were separated by Bis-Tris SDS/PAGE using
a 10% (w/v) polyacrylamide gel and electroblotted onto a
PVDF membrane. For Coomassie staining, 3 µg/well of SVN
and 700 ng/well of SD1 or SD2 were electrophoresed. For
Western blotting, 500 ng/well SVN and 200 ng/well of both
SD1 and SD2 were used. After blocking in 10 mM Tris/HCl, pH
8.0, containing 150 mM NaCl, 0.05% Tween 20 and 5% (w/v)
non-fat milk for 1 h at room temperature, the membrane
was incubated with rabbit anti-SVN polyclonal antibody at
1:5000 dilution overnight at 4 °C. The immune complexes on
the membrane were then reacted with horseradish
peroxidase-conjugated goat anti-rabbit IgG at 1:2000 dilution for
1 h at room temperature. Immunodetection was achieved
by enhanced chemiluminescence (ECL) according to the
manufacturer's protocols. Densitometric scanning of
ECL blots was performed on a Molecular Dynamics 300S
computing densitometer (Sunnyvale, CA) using ImageQuant
V3.0 software.



2.8. Disulfide bond determination

The presence and connectivity of disulfide bonds were
determined as previously described [3]. Briefly, a 100 µg
sample of recombinant, non-reduced SD1 was added to
60 µL of 100 mM ammonium bicarbonate (pH 8.0), 6 µL of
acetonitrile and 0.6 µL of a 40 µM solution of trypsin in H2O.
The mixture was incubated at 37 °C for 16 h and then analyzed
by LC-MS as above. The resulting peptides were evaluated
using peptide recognition software to detect all possible
disulfide-linked peptide fragments [1].

2.9. Anti-HIV assays 

An XTT-tetrazolium-based assay was used to determine the
anti-HIV activity of SVN-derived peptides on acute HIV-1
infection in CEM-SS cells as previously described [9,16]. Eight
serial 1/2 log dilutions of SVN mutants in complete medium
were performed and the peptide solutions were then added to
exponentially growing CEM-SS human lymphocyte cells. Cell
cultures were infected with freshly thawed solutions of
HIV-1RF and allowed to incubate for 7 days. Metabolic reduction of
the tetrazolium salt, XTT, to a colored formazan product was
used to determine cellular viability at the end of the 7-day
incubation period.
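
One simple way to estimate an EC50 from such a half-log dilution series is to interpolate on a log concentration scale between the two doses that bracket 50% protection. The sketch below illustrates that approach only; the source does not state how EC50 values were derived, and the function name and the dose-response numbers are hypothetical.

```python
import math
from typing import Optional, Sequence


def ec50_by_log_interpolation(concs: Sequence[float],
                              protection_pct: Sequence[float]) -> Optional[float]:
    """Estimate the concentration giving 50% protection.

    Interpolates linearly in log10(concentration) between the two adjacent
    doses whose responses bracket 50%. Returns None if 50% is never crossed.
    """
    pairs = sorted(zip(concs, protection_pct))
    for (c_lo, p_lo), (c_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo < 50.0 <= p_hi:
            fraction = (50.0 - p_lo) / (p_hi - p_lo)
            log_ec50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    return None


if __name__ == "__main__":
    # Hypothetical eight-point half-log series (nM) and percent protection values.
    concs = [0.316, 1.0, 3.16, 10.0, 31.6, 100.0, 316.0, 1000.0]
    protection = [2.0, 8.0, 30.0, 62.0, 85.0, 95.0, 98.0, 99.0]
    print(ec50_by_log_interpolation(concs, protection))
```

A full curve fit (for example a four-parameter logistic model) would use all data points, but the interpolation keeps the idea visible in a few lines.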

2.10. ELISA protocols 

To determine the binding of SVN-derived mutants to
glycosylated gp120 and gp41, the envelope glycoproteins
were evaluated by ELISA as previously described [10,18].
Briefly, 100 ng/well of either glycoprotein was bound to a
96-well plate, which was then rinsed three times with PBS
containing 0.05% Tween 20 (TPBS), and blocked with BSA.
Between subsequent steps, the plate was again rinsed with
TPBS (x3). The wells were incubated with serial dilutions of
each mutant peptide, followed by incubation with a 1:1000
dilution of an anti-SVN rabbit polyclonal antibody [3]. The
amount of bound SVN mutant protein was determined by
adding a 1:2000 dilution of donkey anti-rabbit antibody
conjugated to horseradish peroxidase. After addition of the
horseradish peroxidase substrate buffer and color formation,
the reaction was stopped by the addition of 50 µL/well
of 2 M H2SO4 (after 5 min for the gp120 plate and 15 min for the
gp41 plate) and absorbance was measured at 450 nm for
each well.

Additional ELISA assays were performed to test specific
monosaccharides and oligomannose 8 (Man-8) for their
ability to inhibit SVN-derived mutants from binding to
gp120 and gp41. In brief, the plate was prepared as above
with glycosylated gp120 or gp41. 100 ng/well of SVN or
SVN-derived peptides were added in the presence or
absence of increasing concentrations of Man-8 or
100 mM concentrations of the following sugars: glucose,
mannose, galactose, xylose, N-acetylglucosamine or
100 µg/well α-acid glycoprotein (to act as a carrier for
sialic acid). The plate was then washed and visualized
using anti-SVN polyclonal antibodies as above. All data
points are averages of triplicate measurements at each
concentration.

3. Results

3.1. Purification of recombinant SVN mutants

Since SVN has a disulfide bond linkage between Cys7 and
Cys55 which effectively links the first and second domains, the
single domain SD1 and SD2 peptides required mutation of
Cys7 and Cys55 to another amino acid to avoid possible
dimerization and interference with the formation of the
remaining disulfide bonds. For this work, both Cys7 and Cys55
were mutated to serine (Fig. 2).

The pET32c(+) vector is designed for cloning and high-level
expression of protein sequences fused with the 109aa
TRX Tag thioredoxin protein. The thioredoxin expression
system was used to enhance disulfide bond formation. A
cloning site containing a cleavable His-Tag sequence was
used for easier detection and purification. The enterokinase
digestion site between the SVN peptides and their
fusion protein partner provided an easy way to separate
SVN mutants from the thioredoxin. Using an SVN purification
protocol reported previously [21], purified SVN mutants
were obtained and their purities and molecular weights
were confirmed by LC-MS; the yields of the SVN mutants
are also listed (Table 1). Although SD1 and SD2 have
similar sequences with 75% homology, the SDS-PAGE
mobilities of the peptides were different (Fig. 3A, Table 1).
SD1 mainly existed in monomeric form, whereas SD2
appeared in the form of a dimer. The Western blot analysis
showed that purified SD1 and SD2 samples were both still
recognized by anti-SVN polyclonal antibody (Fig. 3B).
Additional dot blot analysis indicated that SD1, SD2 and
SVN(1-40) had modestly reduced binding affinity for SVN
polyclonal antibodies when compared with SVN (data not
shown).

3.2. Disulfide bond pattern of SD1

The disulfide bond pattern of native SVN was previously
reported [3]. Recombinant SD1 showed the expected molecular
weight of 4944.3 Da by LC-MS. When SD1 was reduced
with 10 mM β-mercaptoethanol, its molecular weight
increased to 4948.2, a 4 Da increase which demonstrated that
SD1 had two disulfide bonds.
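
The reasoning behind this count is that reducing one disulfide bond adds two hydrogen atoms, or about 2 Da, to the molecule. A minimal sketch of the arithmetic follows; the function name is hypothetical, and the hydrogen mass of 1.008 Da is the standard average atomic mass rather than a value given in the paper.

```python
H_AVERAGE_MASS_DA = 1.008  # average atomic mass of hydrogen


def disulfides_from_reduction(mass_oxidized_da: float, mass_reduced_da: float) -> int:
    """Estimate the number of disulfide bonds from the mass gained on reduction.

    Each reduced disulfide (S-S -> 2 SH) adds two hydrogen atoms.
    """
    delta = mass_reduced_da - mass_oxidized_da
    return round(delta / (2 * H_AVERAGE_MASS_DA))


if __name__ == "__main__":
    # LC-MS masses reported for SD1 before and after reduction.
    print(disulfides_from_reduction(4944.3, 4948.2))  # 2
```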

Fig. 3 - SDS-PAGE and immunoblot analysis of purified
SVN, SD1 and SD2. (A) Coomassie blue-stained gel of the
purified peptides. (B) Immunoblot analysis of an identical
gel with rabbit anti-scytovirin polyclonal antibodies [Lane
1: SVN, Lane 2: SD1 and Lane 3: SD2].



LC-MS data on the trypsin digest fragments of non-reduced
SD1 and SD2 and peptide recognition software [1] were used to
identify the disulfide bonds present in SD1 and SD2 (Fig. 4). The
molecular weight of one trypsin-digested fragment of
recombinant SD1 (m/z = 3154.0 Da) was first input into the software
to establish the disulfide bond pattern. This mass matched
only the calculated mass for the peptide extending from SD1
residue 1 to SD1 residue 30 bearing an intact disulfide linkage
between the only two Cys residues present in that fragment,
Cys20-Cys26. The only other possible disulfide bond in SD1
was between Cys32 and Cys38. This bond was confirmed by
mass data for the peptide fragment extending from residue 31
to residue 48 (MW = 1790.2 Da), which matched that expected
for the second disulfide bond. No evidence was seen for
different disulfide-bonding patterns between alternate pairs
of Cys residues in SD1. This confirmed that the disulfide bond



SD1(1-48)Cys7Ser:   GSGPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKPDPG

SD2(49-95)Cys55Ser: PKGPTYSWDEAKNPGGPNRCSNSKQCDGARTCSSSGFCQGTAGHAAA

SVN(1-40)Cys7Ser:   GSGPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQG

SVN(3-45)Cys7Ser:   GPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKP

SVN(6-45)Cys7Ser:   YSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKP

SVN(11-45):         ANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKP



Fig. 2 - Amino acid sequences of SVN mutants. The cysteines at positions 7 and 55 of the SVN sequence have been mutated
into serine.

Table 1 - SVN peptides

Protein                    Yield (mg/L LB medium)   Molecular weight (a)   EC50 (b) (nM)
SVN                        10                       9712.2                 4.5
SD1 (SVN(1-48)Cys7Ser)     2.5                      4944.3                 7.6
SD2 (SVN(49-95)Cys55Ser)   1.7                      4759.8                 182
SVN(1-40)Cys7Ser           22                       4126.4                 34.5
SVN(3-45)Cys7Ser           1.3                      4535.3                 >2200
SVN(6-45)Cys7Ser           0.8                      4281.0                 >2200
SVN(11-45)Cys7Ser          1.9                      3601.1                 >2200

rEK cleavage efficiency (%): 95, 98, 85, 96, 90, 85

(a) Determined by electrospray mass spectrometry.
(b) Determined against HIV-1RF in a T-lymphocyte cell cytoprotection assay.


arrangement for SD1 was the same as found in native SVN [3].
In a similar manner to SD1, the disulfide bond pattern of
recombinant SD2 was established using the trypsin digest
fragments corresponding to residues 1-30 (m/z = 3235.8 Da)
and residues 31-47 (m/z = 1553.0 Da) of SD2 (Fig. 4).

3.3. Anti-HIV activity of SVN mutants 

Native SVN has been reported to display potent antiviral
activity against laboratory strains and primary isolates of
HIV-1 with EC50 values ranging from 0.3 to 22 nM [3]. In side-by-side
in vitro anti-HIV assays using CEM-SS host cells and HIV-1RF,
the anti-HIV activities of the recombinant SVN peptides were
evaluated and compared to full-length SVN (Table 1).
Anti-HIV activity comparable to the full-length SVN was retained in
SD1, while SD2 showed about a 40-fold decrease in anti-HIV
potency.

Additional experiments examined whether the N- and C-termini
of SD1 were necessary for antiviral activity. A series
of deletion mutants was constructed in which 2, 5 or 10 amino
acids were deleted from the N-terminus of SD1, or 8 amino
acids were deleted from the C-terminus of SD1. Deletion of 2, 5
or 10 N-terminal amino acids completely eliminated antiviral
activity, indicating that N-terminal amino acids of SD1 are
necessary for maintaining the antiviral activity of SD1.
Deletion of the eight C-terminal amino acids resulted in a
three- to seven-fold decrease in anti-HIV potency, indicating
that the eight C-terminal amino acids are necessary for
optimal activity but not as integral as the N-terminal amino
acids.
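
The fold differences quoted in this section can be recomputed from the EC50 values. The following is a minimal sketch, not part of the original analysis. It assumes the Table 1 values of 4.5 nM (SVN), 7.6 nM (SD1), 182 nM (SD2) and 34.5 nM (SVN(1-40)); the helper name fold_change is hypothetical, and the choice of reference peptide for each ratio is an assumption.

```python
# EC50 values (nM) against HIV-1RF, as read from Table 1 (assumed transcription).
ec50_nm = {
    "SVN": 4.5,
    "SD1": 7.6,
    "SD2": 182.0,
    "SVN(1-40)": 34.5,
}


def fold_change(test_nm: float, reference_nm: float) -> float:
    """Return the EC50 ratio; values above 1 mean the test peptide is less potent."""
    return test_nm / reference_nm


# SD2 compared with full-length SVN (the "about 40-fold" decrease).
sd2_vs_svn = fold_change(ec50_nm["SD2"], ec50_nm["SVN"])

# C-terminally truncated SD1 compared with SD1 and with full-length SVN.
trunc_vs_sd1 = fold_change(ec50_nm["SVN(1-40)"], ec50_nm["SD1"])
trunc_vs_svn = fold_change(ec50_nm["SVN(1-40)"], ec50_nm["SVN"])

print(f"SD2 vs SVN: {sd2_vs_svn:.1f}-fold")
print(f"SVN(1-40) vs SD1: {trunc_vs_sd1:.1f}-fold")
print(f"SVN(1-40) vs SVN: {trunc_vs_svn:.1f}-fold")
```

The N-terminal deletion mutants are left out because Table 1 reports only a lower bound (>2200 nM) for them, so a ratio would itself be only a lower bound.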

3.4. gp120 and gp41 binding pattern of SVN mutants

We have previously reported that SVN binds to the HIV-1
viral envelope proteins gp120, gp41 and gp160 [3]. To compare
the gp120- and gp41-binding activity of the SVN-derived
peptides with that of SVN, ELISA experiments were performed.
The results showed that SD1 appears to have gp120- and
gp41-binding activity similar to intact SVN (Fig. 5). In
contrast, SD2 has reduced gp120- (~57%) and gp41- (~44%)
binding ability relative to full-length SVN. The binding
activity of SVN(1-40) with gp120 and gp41 is about 44 and 70%,
respectively, relative to SVN, while the N-terminal mutants
have almost no detectable binding activity with gp120 and
gp41 (Fig. 6). An additional ELISA experiment showed that
100 mM concentrations of glucose, mannose, galactose, xylose,
N-acetylglucosamine and α-acid glycoprotein with sialic acid
did not inhibit binding of the SVN-derived peptides (SD1, SD2
and SVN(1-40)) to gp120 and gp41 (data not shown). Finally,
an ELISA experiment demonstrated a concentration-dependent
decrease in the binding of SD1 to gp41 as a result of
co-incubation with the high-mannose oligosaccharide
oligomannose-8 (Fig. 7).



SD1(1-48)Cys7Ser:

=4948.2 Da        GSGPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKPDPG
                  2 disulfide bonds

m/z = 3154.0 Da   GSGPTYSWNEANNPGGPNRCSNNKQCDGAR    Cys20-Cys26
m/z = 1790.2 Da   TCSSSGFCQGTSRKPDPG                Cys32-Cys38

SD2(49-95)Cys55Ser:

m/z = 3235.8 Da   PKGPTYSWDEAKNPGGPNRCSNSKQCDGAR    Cys20-Cys26
m/z = 1553.0 Da   TCSSSGFCQGTAGHAAA                 Cys32-Cys38



Fig. 4 - Disulfide-bonding pattern of recombinant SD1 and SD2. The values of trypsin-digested SD1 or SD2 were input into
the software (http://sxl02a.niddk.nih.gov/peptide/) to establish the disulfide-bonding pattern.
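
The residue numbering used in Fig. 4 can be checked against the fragment sequences with a few lines of code. This is only an illustrative sketch: the sequences are transcribed from the figure and joined in order, SD2 is numbered from 1 as in the figure, and the helper name cysteine_positions is hypothetical.

```python
# Tryptic fragments as printed in Fig. 4, joined to give each construct.
sd1 = "GSGPTYSWNEANNPGGPNRCSNNKQCDGAR" + "TCSSSGFCQGTSRKPDPG"
sd2 = "PKGPTYSWDEAKNPGGPNRCSNSKQCDGAR" + "TCSSSGFCQGTAGHAAA"


def cysteine_positions(sequence: str) -> list[int]:
    """Return the 1-based positions of every cysteine in a one-letter sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "C"]


# Position 7 carries the Cys-to-Ser substitution in both constructs
# (Cys7Ser in SD1; Cys55Ser in SD2, which is residue 7 of the SD2 peptide).
print(len(sd1), cysteine_positions(sd1), sd1[6])
print(len(sd2), cysteine_positions(sd2), sd2[6])
```

The four remaining cysteines in each peptide fall at the positions paired in the figure (20 with 26, 32 with 38); the pairing itself comes from the digest data and is not derived by this check.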
Fig. 5 - ELISA study of the concentration-dependent
binding of recombinant SVN or SD1 to (A) gp120 and (B)
gp41. Glycosylated gp120 or gp41 was bound to an ELISA
plate and then treated with SVN or SD1. The amount of
bound SVN or SD1 was determined by absorbance at
450 nm as described in Section 2. Results are the average
absorbance at 450 nm (±S.D.) from triplicate wells.



4. Discussion 

The anti-HIV protein scytovirin is a novel antiviral protein
from a cultured cyanobacterium with little homology to any
known protein [3]. SVN does, however, show a strong internal
sequence homology leading to the hypothesis that the protein
is made up of two putative domains from residue 1-48
(scytovirin domain 1, SD1) and from 49 to 95 (scytovirin
domain 2, SD2) (Fig. 1) [3]. Here we tested that hypothesis by
cloning and expressing SD1, SD2 and a series of SD1 truncation
mutants in an E. coli expression vector. The use of the
thioredoxin expression system, necessary for the recombinant
production of biologically active SVN [21], was again
successful in producing properly folded, biologically active
peptides with the appropriate disulfide bonds (Fig. 4).
Surprisingly, SD1 displayed equivalent antiviral activity and
gp120- and gp41-binding activities relative to SVN (Fig. 5).
SD2, however, displayed significantly lower anti-HIV activity
than SVN (40-fold higher EC50) and ~50% reduced binding to
gp120. Additional truncation of SD1 at the carboxy-terminus by
eight amino acids [SVN(1-40)] resulted in a significant
decrease in both HIV-1 envelope glycoprotein binding and
anti-HIV activity, while any additional truncation of SD1 at
the N-terminus eliminated biological activity.

The fact that the two domains of SVN differ so significantly
in their activities was not expected. Additional structural
studies will be crucial for elucidating the exact conformational
determinants responsible for the antiviral activity of SVN.
Preliminary NMR results have indicated that the SD1 and SD2
domains have some elements of structural similarity (Drs.
McFeeters and Byrd, personal communication), but their
anti-HIV activities and gp120 and gp41 binding activities are quite
Fig. 6 - ELISA study of the binding of recombinant SVN and mutants to (A)
gp120 and (B) gp41. Glycosylated gp120 or gp41 was bound
to an ELISA plate and then treated with SVN or mutants.
The amount of bound SVN or mutants was determined by
absorbance at 450 nm as described in the Experimental
section. All data are corrected for background antibody
absorption in the absence of captured protein
(typically < 0.1 OD450). Results are the average absorbance
at 450 nm (±S.D.) from triplicate wells.
Fig. 7 - Concentration-dependent inhibition of SD1 binding
to HIV envelope protein gp41 by the high-mannose
oligosaccharide Man-8. Points are the averages ± standard
deviations of three replicate determinations.
different. Some insights into these differences can be inferred
from the data on the additional truncation mutants and the
preliminary NMR data. The information on N-terminal
truncations of SD1 and SVN(1-40) indicates that the
N-terminal amino acids of SVN are key for the envelope
glycoprotein binding and anti-HIV activity of SD1. As the first
two amino acids Pro49 and Lys50 in SD2 differ significantly
from the Gly1 and Ser2 in SD1, the loss of biological activity
following their deletion is non-intuitive and indicates
differing structural importance for the amino acids in SD1 vs. those
in SD2. Initial data from the NMR studies indicated that the
Gly1-Ser2 in SD1 are integral to the structural integrity of
domain 1 of SVN and in close proximity to the oligosaccharide
binding site, while the Pro49-Lys50 in SD2 appear to be part of a
hinge region between the two domains of SVN and not of such
structural importance (Drs. McFeeters and Byrd, personal
communication). The further truncation of SD1 by removal of
the eight amino acids at the C-terminus resulted in a
significant reduction but not a complete loss of activity.
Further deletion of C-terminal amino acids was not
performed, since this would disrupt the second disulfide bond
of SD1, which likely has a substantial role in defining the
conformation and, thus, the biological activity of the peptide.
Taken together with the sequence overlap, the data suggest
that the C-terminal eight amino acids of SD1 and N-terminal
six amino acids of SD2, regions of low sequence homology,
may comprise a flexible region separating two
oligosaccharide-binding domains of SVN.

SVN has previously been shown to bind to HIV-1 envelope
glycoproteins through specific interactions with substructures
of high-mannose oligosaccharides [2,3]. In particular, SVN was
reported to bind specifically to an α1-2, α1-2, α1-6-linked
tetramannoside [2], a substructure of oligomannose-8 and -9
that are found on the HIV envelope glycoproteins gp120 and
gp41. It was therefore important to determine whether or not
this level of specificity was replicated in the single domain
SVN-derived peptides. To test this, an ELISA experiment was
performed in the presence of high concentrations of six
different monosaccharides to determine if they could inhibit
binding to HIV-1 gp120. None of the sugars was able to inhibit
the binding of SD1, SD2 or SVN(1-40) to gp120 (data not shown),
thereby indicating that the individual domains retained some
level of carbohydrate specificity and were not acting as
monosaccharide-specific lectins. This result was buttressed
by additional experiments showing that oligomannose-8 was
able to inhibit the binding of SD1 to gp41 in a
concentration-dependent manner (Fig. 7). The hypothesis that SVN specificity
resided in the individual domains of SVN was also later
confirmed by NMR analysis of a titration of the SVN-binding
tetrasaccharide [2] into SVN (Drs. McFeeters and Byrd,
personal communication). This level of oligosaccharide
specificity in such a relatively small peptide is unique to
SD1 among lectins. The molecular interactions necessary to
achieve this selective binding are currently being studied.

The development of HIV fusion inhibitors has focused on
three strategies: (1) blocking the interaction of gp120 with the
cellular receptor CD4, (2) blocking the secondary interaction of
gp120 with the cellular co-receptors CCR5 or CXCR4, or (3)
disrupting the formation of the six-helix bundle of
fusion-active gp41 [19,20]. T-20, a member of the latter group, has
been approved by the U.S. Food and Drug Administration for
treatment of infection by HIV-1 [6,13,14]. Antibodies binding to
gp120 such as Pro 542 (a fusion protein including 4 IgG2
molecules in which the variable fragments of both light chains
are substituted with the D1/D2 domains of CD4) [11] and
peptides such as CD4M33 (a 27-amino acid peptide mimicking
the CD4 domain D1) [15] are also in clinical trials. The peptide
SD1, with essentially the same anti-HIV and gp120 binding
activities as full-length SVN, represents a distinct strategy for
an anti-HIV therapeutic or prophylactic agent. The advantages
of the SD1 peptide are significant. SD1 has 3 fewer disulfide
bonds than SVN and also potentially less immunogenicity due
to its decreased size. Furthermore, SD1 is within the size range
of peptides easily synthesized using automated techniques,
which provides an alternative to biological production. These
advantages, in addition to its broad spectrum of activity
against HIV-1 viruses, low nanomolar activity and physical
stability, provide ample reason for the continued study of
SD1 for its potential utility as an anti-HIV microbicide or
therapeutic.
The solution structure of the potent 95 residue anti-HIV protein scytovirin
has been determined and two carbohydrate-binding sites have been
identified. This unique protein, containing five structurally important
disulfide bonds, demonstrates a novel fold with no elements of extended
regular secondary structure. Scytovirin contains two 39 residue sequence
repeats, differing in only three amino acid residues, and each repeat has
primary sequence similarity to chitin binding proteins. Both sequence
repeats form similarly structured domains, with the exception of one region.
The result is two carbohydrate-binding sites with substantially different
affinities. The unusual fold clusters aromatic residues in both sites,
suggesting a binding mechanism similar to other known hevein-like
carbohydrate-binding proteins but differing in carbohydrate specificity.
Scytovirin, originally isolated from the cyanobacterium Scytonema varium,
holds potential as an HIV entry inhibitor for both therapeutic and
prophylactic anti-HIV applications. The high-resolution structural studies
reported are an important initial step in unlocking the therapeutic potential
of scytovirin.

© 2007 Elsevier Ltd. All rights reserved.
Introduction

Protein:carbohydrate interactions play fundamental
roles throughout biology. Bacterial infection, cell
growth, inflammation, cell mobility, fertilization,
cell to cell adhesion, and viral infection are all
influenced by protein:carbohydrate interactions.
Therefore protein:carbohydrate interactions are a
topic of major interest throughout science. One
well-studied group of lectins is the hevein-like
family. This family is structurally united by the
presence of similar chitin-binding domains (CBDs),
and its members belong to carbohydrate-binding module
family 18. Hevein, isolated from the latex of
Hevea brasiliensis, is composed of 43 residues and
its structure has been known since 1991. Other
hevein-like CBDs vary in length, typically from 38-45
residues, and are often found in multiple repeats
within a single protein. A dual CBD-like motif is
present in the potent antiviral protein scytovirin,
originally isolated from the cyanobacterium
Scytonema varium.
Scytovirin is of particular interest due to the fact
that it possesses potent and broad spectrum
antiviral, including anti-HIV, activity not found in
the other hevein-like domain proteins. Scytovirin
exhibits low nanomolar activity against HIV and
binds to the HIV envelope glycoproteins gp160,
gp120 and gp41, but it does not bind to the T-cell
extracellular CD4 receptor or other common cell
surface proteins. The interaction is
carbohydrate-dependent, with scytovirin most readily binding to
glycosylated gp120. Scytovirin's ability to completely
inactivate laboratory strains and primary isolates
of HIV-1 makes it a promising candidate for
anti-HIV microbicide development.

Scytovirin contains two 39 residue repeats, which
exhibit rather low similarity to a subset of diverse
CBD-containing proteins. The initial sequence
analysis offers little indication of the mechanism of
antiviral activity beyond similarity to CBDs. The
comparison to hevein-like CBDs was not originally
examined; however, the sequence similarity between
the scytovirin repeats and the hevein domain
is approximately 30%. In addition, several features
common to hevein family members are present in
scytovirin: There are two CBD-like domains separated
by a proline-rich linker. Both scytovirin and
hevein-like CBDs have a high percentage of cysteine
residues, all participating in disulfide bonds. Each
domain of scytovirin has three aromatic residues,
similar to the conserved triad of aromatic residues
found in other hevein-like CBDs. Thus, it seems
plausible that scytovirin is another hevein family
member.

However, scytovirin shows distinct differences
from the hevein-like proteins. First, scytovirin does
not bind chitin. In fact, scytovirin shows a very
restrictive carbohydrate binding specificity, having
no measurable affinity for monosaccharides or
common trisaccharide cores from higher order
mannose oligosaccharides (vide infra). This observation
is in stark contrast to the increasing affinity
for longer oligosaccharides reported for other
hevein-like CBDs. Second, the disulfide bonds
among the conserved cysteine residues of scytovirin
are shuffled compared to the consensus hevein
arrangement. Scytovirin has only five cysteine
residues per domain. These align exactly with five
of the eight cysteine residues per domain in the
classical hevein arrangement. The presence of an
odd number of cysteine residues in each domain of
scytovirin causes the disulfide bonds to rearrange,
resulting in two disulfides per domain and one
inter-domain disulfide, unlike the four disulfide
bonds typically found in hevein domains. Although
scytovirin has three aromatic residues per domain,
similar to hevein-like CBDs, the sequence arrangement
of the aromatic residues does not match the
classic hevein domain sequential arrangement (vide
infra).

Thus, scytovirin has remained a rather enigmatic,
but potent antiviral protein. In order to better
understand the antiviral mechanism and enable
development of this natural protein, we have
determined the three-dimensional solution structure
of recombinant scytovirin and found it to have a
novel fold, quite different from hevein. We have also
examined the carbohydrate-binding properties with
short oligomannose carbohydrates, which were
previously identified. These structural studies are
a vital part of the future development for scytovirin
and detail the unique characteristics of this protein,
both with respect to its antiviral activity and its
relationship to other well-studied
carbohydrate-binding proteins.



Results 

Chemical shift and NOE assignments 

Chemical shift assignments of recombinant scytovirin
(Figure 1(a)) were complicated by resonance
degeneracy. A doubling of most peaks was immediately
apparent from the 15N-heteronuclear single
quantum coherence (HSQC) (Figure 1(b)). Duplication
of the amide backbone chemical shifts for a majority
of the resonances readily suggested that the sequence
repeats of scytovirin have similar backbone
folds. There was an indication that each sequence
repeat might represent a separate domain of the
total protein structure, and we refer to the two
sequence repeats as domains, which is borne out in
the complete analyses (vide infra). Chemical shift
degeneracy was exacerbated for side-chains where
13C and 1H resonances were not as well dispersed as
15N chemical shifts. For illustration, strip plots for a
stretch of five consecutive residues in each domain,
showing the Cα and Cβ resonances and sequential
correlations from triple resonance NMR spectra, are
shown in Figure 1(c).

The three amino acid differences between
sequence repeats of scytovirin aided the sequential
assignment process; nevertheless, meticulous manual
analysis of standard triple resonance spectra
had to be supplemented with additional experiments
to unambiguously assign backbone resonances.
An HMQC-NOESY-HMQC, twice taking
advantage of 15N chemical shift dispersion, greatly
aided the resolution of inter- versus intra-domain
backbone sequential assignments. Pulse sequences
to assign through proline resonances were also
used to confirm assignments. Information from
these experiments was especially useful for the
P-G-G-P sequence in each domain and the P-D-P-G-P
sequence in the linker. Manual analysis of nuclear
Overhauser effect spectroscopy (NOESY) data was
also utilized to help resolve chemical shift ambiguity.
In summary, nearly complete domain-specific
resonance assignments were obtained for the
first 94 residues of scytovirin. Residue 95, the
C-terminal alanine, was not observable. Over 98% of
backbone and 92% of side-chain resonances were
accounted for. As expected, each 15N-HSQC peak
pair corresponded to complementary resonances of
the 39 residue sequence repeats. Chemical shift
Figure 1. Scytovirin sequence, resonance assignments, and chemical shift indices. (a) Amino acid sequence of
scytovirin aligned to show sequence repeats. The background of the first domain (SD1) is shaded in red while the
background of the second domain (SD2) is shaded in blue. Black lines indicate disulfide bonds. Green boxes highlight the
three natural sequence differences between domains. (b) A 15N-HSQC of scytovirin illustrates peak doubling of
complementary resonances. Residues S21, N22, N23, K24, and Q25 from SD1 are labeled in red and complementary
residues S69, N70, S71, K72, and Q73 from SD2 are labeled in blue. (c) Slices from HNCACB and CBCA(CO)NH spectra
for the same ten residues as (b). HNCACB Cα resonances are shown in red and Cβ resonances in blue. CBCA(CO)NH
resonances, from previous residue Cα and Cβ resonances, are shown in magenta. Background shading is red for SD1 and
blue for SD2. (d) The difference in Cα (top) and Cβ chemical shifts from random coil values. No extended regions of
regular secondary structure are expected, since the characteristic positive difference for α-helices and negative difference
for β-sheets is not observed for Cα chemical shifts.



analysis of backbone Cα and Cβ resonances
(Figure 1(d)) gave the first indication that no
extended regions of regular secondary structure
were present.
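
The three natural sequence differences between the repeats can be located with a short comparison. The sketch below is illustrative only. It assumes the SD1 (residues 1-48) and SD2 (residues 49-95) sequences of the Cys7Ser and Cys55Ser constructs printed in Fig. 4 of Attachment A, and it assumes that the 39-residue repeats span residues 3-41 and 51-89; the function name repeat_differences is hypothetical.

```python
# Construct sequences as printed in Fig. 4 of Attachment A (assumed transcription).
SD1 = "GSGPTYSWNEANNPGGPNRCSNNKQCDGARTCSSSGFCQGTSRKPDPG"  # residues 1-48
SD2 = "PKGPTYSWDEAKNPGGPNRCSNSKQCDGARTCSSSGFCQGTAGHAAA"   # residues 49-95

REPEAT_LENGTH = 39
REPEAT1_START = 3    # assumed first residue of the repeat in SD1
REPEAT2_START = 51   # assumed first residue of the repeat in SD2
SD2_FIRST_RESIDUE = 49


def repeat_differences(sd1: str, sd2: str) -> list[tuple[int, str, int, str]]:
    """List (SD1 position, SD1 residue, SD2 position, SD2 residue) for each mismatch."""
    start1 = REPEAT1_START - 1
    start2 = REPEAT2_START - SD2_FIRST_RESIDUE
    repeat1 = sd1[start1:start1 + REPEAT_LENGTH]
    repeat2 = sd2[start2:start2 + REPEAT_LENGTH]
    return [
        (REPEAT1_START + i, a, REPEAT2_START + i, b)
        for i, (a, b) in enumerate(zip(repeat1, repeat2))
        if a != b
    ]


differences = repeat_differences(SD1, SD2)
for pos1, res1, pos2, res2 in differences:
    print(f"{res1}{pos1} / {res2}{pos2}")
```

Because both constructs carry serine at the mutated cysteine position, that site does not register as a difference here; in wild-type scytovirin it is cysteine in both repeats, so the count is unaffected.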

NOESY assignments were also complicated by
chemical shift degeneracy, especially for inter- versus
intra-domain NOEs. No symmetry for the two
domains was assumed and an initial structure
was calculated from a sparse set of unambiguously
assigned NOEs. A moderate resolution structure
was obtained and a degree of 2-fold symmetry was
immediately apparent. The initial structure allowed
for resolution of more inter- versus intra-domain
NOE ambiguity. After multiple iterations, 1205
intraresidue, 1006 sequential, and 634 long range
NOEs were assigned (Table 1).

Structural overview and backbone flexibility 

The solution structure of scytovirin was determined
using NOEs and residual dipolar coupling
data (RDCs). Initial structures, determined using
only NOEs, exhibited a backbone RMSD of 1.4(±0.1)
Å to the mean structure. The sequence repeats
corresponded to structured domains with residues 3-43
forming structural domain 1 (SD1) and 51-89
forming structural domain 2 (SD2). The disulfide
bonding pattern, previously determined from
proteolysis and peptide mapping, was readily apparent
in these initial structures. Structure calculations
with and without disulfide restraints showed only
small differences, and subsequent calculations
included these restraints. Backbone RMSDs for the
domains, aligned individually, were 1.0 Å for SD1
and 1.3 Å for SD2. More striking than the domain
similarity was the complete absence of regular
secondary structure. In agreement with chemical shift
indexing data, no extended α-helices or β-sheets
were observed. Circular dichroism (Supplementary
Data Figure 1) also indicates an atypical fold. The CD
spectrum for scytovirin does not exhibit an α-helical,
β-sheet, mixed helix/sheet, or unfolded profile.
Refinement against J-modulated HαCα and NH
RDCs confirmed the unusual fold and total lack of
regular secondary structure. Two regions of
extended structure are superficially similar to
β-sheets. In SD1, residues 28-32 lie antiparallel to
residues 38-42. Also, residues 6-12 lie antiparallel to
the disjoint sections 17-20 and 26-28. Although
reminiscent of β-sheets, neither Phi/Psi angles nor
chemical shift indexing strictly indicate the presence
of classical β-sheet folds. Furthermore, temperature
coefficients of the amide resonances are not
consistent with hydrogen bonding patterns expected
for antiparallel β-sheets. The same holds for
complementary residues in SD2. Additionally, there are
no Ile, Leu, or Val residues in the sequence, which
results in a very minimal hydrophobic core. The
limited hydrophobic core within each domain is
consistent with the unusual fold. Thus, scytovirin
has no recognizable α-helix or β-sheet elements of
secondary structure.

Attachment B

The Novel Fold of Scytovirin

Table 1. NMR and refinement statistics for scytovirin

Distance constraints
  Total NOE                                    2905
    Intra-residue                              1247
    Inter-residue                              1658
      Sequential (|i-j| = 1)                   679
      Medium-range (|i-j| < 4)                 324
      Long-range (|i-j| > 5)                   655
  Total RDCs                                   148
    NH                                         82
    HαCα                                       66
  Q (%)                                        0.25

Structure statistics
  Violations (mean and s.d.)
    Distance constraints > 0.5 Å (Å)           24±2
    Average violation > 0.5 Å (Å)              0.69±0.16
    Max. distance constraint violation (Å)     1.17
  Deviations from idealized geometry
    Bond lengths (Å)                           0.01±0.06
    Bond angles (°)                            1.59±2.55
    Impropers (°)                              0.08±2.19
  Average r.m.s.d. to mean(a) (Å)
    Heavy                                      0.42±0.11
    Backbone                                   0.51±0.11

(a) Calculated from 20 refined structures.
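
The distance-constraint counts listed in Table 1 can be checked for internal consistency with simple arithmetic. The following is a minimal sketch, assuming the values as transcribed from Table 1; the dictionary keys are hypothetical labels chosen for illustration.

```python
# NOE counts as listed in Table 1 (assumed transcription).
noe_counts = {
    "intra_residue": 1247,
    "sequential": 679,
    "medium_range": 324,
    "long_range": 655,
}
reported_inter_residue = 1658
reported_total = 2905

# Inter-residue NOEs are the sum of the sequential, medium- and long-range classes.
inter_residue = (
    noe_counts["sequential"]
    + noe_counts["medium_range"]
    + noe_counts["long_range"]
)

# The total is the intra-residue count plus all inter-residue NOEs.
total = noe_counts["intra_residue"] + inter_residue

print(inter_residue == reported_inter_residue, total == reported_total)
```

The same pattern applies to the RDC rows, where the NH and HαCα counts (82 and 66) add up to the reported total of 148.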

The lowest energy 20 refined scytovirin structures
are shown in Figure 2 and coordinates are deposited
under PDB accession number 2JMV. In the final
structures, the backbone RMSD of the entire protein
improved to 0.42(±0.11) Å (Table 1). Backbone
RMSDs of the individual domains improved to
0.25 Å for SD1 and 0.40 Å for SD2. In addition to
improved precision, RDC refinement helped define
the linker and last loop-like structure of each
domain. Calculated versus observed RDC plots and
cross-validation are included as Supplementary
Data Figure 2. A DALI structure search found no
similar structures for either SD1, SD2 or full-length
scytovirin. Thus scytovirin represents a truly novel
fold.

Despite the lack of regular secondary structure,
scytovirin appears to be quite ordered on NMR
measurable timescales. Transverse and longitudinal
15N relaxation rates showed no regions of significant
variation except in the C terminus. The correlation
time derived from these data was 5.1 ns, which
corresponds to a monomeric 9.7 kDa protein.
Heteronuclear NOEs at multiple field strengths resulted
in average values (excluding the two C-terminal
residues) of 0.80±0.05 and 0.80±0.07 at 600 and
800 MHz, respectively. Data are included as
Supplementary Figure 3. No significant change in
average NOE is observed when calculated separately
for the individual domains. It is apparent that
the backbone of scytovirin does not exhibit flexibility,
except the last two C-terminal residues. The N
terminus is well ordered, beginning at residue 1, and
participates in structuring of SD1. The interactions
of residues G1 and S2 with C26, D27, and R30
contribute to the structural difference between
domains. Hydrogen/deuterium exchange experiments
show that 17 residues in each domain exhibit
protected amide protons. All of the protected
amides were on the interior of the protein. Some of



Figure 2. Solution structure of scytovirin. Backbone
traces of lowest energy 20 scytovirin structures of 200
calculated from refinement against HαCα and NH RDCs.
SD1 is colored red, SD2 blue, disulfide bonds in yellow,
and the linker and termini gray.
these amides must participate in hydrogen bonds;
however, due to the lack of regular secondary
structure it was not possible to unambiguously
assign hydrogen bonding partners. Therefore, the
structures were calculated without such constraints.

Domain interface 

The interface between SD1 and SD2 lies in the
middle of the protein and helps stabilize the fold of
each domain. The structure of each domain was
calculated separately, using intra-domain NOEs.
The backbone RMSDs worsened due to the loss of
inter-domain NOEs, but the same overall folds were
recognized. One end of the interface is cross-linked
by the disulfide bond between C7 and C55. Multiple
unambiguous NOEs, most notably T5 HG to T53
HG, substantiate the presence of the disulfide bond.
The remainder of the interface is predominantly
hydrophobic. A large number of unambiguous
NOEs are observed between residue C20 and
residue Q73. The interface is ~580 Å² and may help
stabilize the domain structures, in agreement with
initial studies of the individual domains (unpublished
results). The linker does not appear to
contribute significantly to the interface.

Conserved structural elements 

As expected from the similarity of chemical shifts
in each domain, the amino acid sequence repeats
have similar overall backbone folds; however, the
RMSD between SD1 and SD2 is 5 Å. Close
inspection reveals a conformational change that
explains the difference between the domains. Before
discussion of the differences, it is beneficial to
understand the likenesses.

Three discrete, common structural elements are
apparent in both domains of scytovirin (Figure 3(a)).
The first conserved structural element, the "top
loop," is formed by residues 12-17(60-65) of SD1
(SD2) (Figure 3(b)). The top loop circumnavigates
approximately 240° of a complete circle. The first
position of the top loop is the site of one of the
three sequence differences (N12/K60) between SD1
and SD2, and the rest of the loop is formed by a
P-G-G-P sequence. Both top loops exhibit a similar
pattern of heteronuclear NOEs at 600 and
800 MHz. The first glycine (G15/G63) in the loop
has an average NOE value of 0.77±0.01, whereas
the second glycine (G16/G64) has a value of
0.84±0.02. The first glycine is closer to the
solvent-exposed apex of the loop and therefore it
is reasonable that it is slightly less conformationally
restricted. The second conserved structural element,
denoted the "middle turn," is formed by
residues 22-25(70-73) (Figure 3(c)). A disulfide
bond between cysteine residues 20(68) and 26(74)
helps demark the boundaries of this conserved
structural element. In SD1, a probable hydrogen
bond between the backbone carbonyl oxygen of
residue 22 and the backbone amide proton of
residue 25 produces a likeness to a classical turn.
However, unlike a classical turn, the last residue of
both middle turns introduces an abrupt change in
direction. The change in direction is followed by a



Figure 3. Three major sub-structural elements in scytovirin. (a) Twenty member ensemble of SD1 colored to show the
three major loop-like elements conserved in both domains in red. See the text for discussion. (b) The top loop, (c) middle
turn and (d) bottom knot of SD1 (red) and SD2 (blue) are aligned to illustrate the structural similarities and differences.
kinked extended region leading to the third
conserved structural element and most distinguishing
feature of scytovirin, denoted the "bottom
knot," which is formed from residues 31-39(79-87)
(Figure 3(d)). As with the middle turn, a disulfide
bond between residues 32-38 (80-86) helps define
the boundaries of this conserved structural element.
Refinement against RDCs greatly improves
resolution of this element in each domain, revealing
a quite unusual fold.

Domain differences 

Both SD1 and SD2 are composed of the same
structural elements; however, the relative orientations
of the elements within each domain are not the
same, resulting in the 5 Å RMSD between SD1 and
SD2. The major differences are centered around the
middle turn. In SD2, the middle turn is flipped 180°
with respect to the rest of the domain when
compared to the middle turn of SD1 (Figure 3(c)).
Backbone chemical shift differences between SD1
and SD2 are greatest for residues in the middle turn
and the extended region directly following (Figure
4). Interactions between residues G1 and C26, D27,
R30 of scytovirin are responsible for the difference,
since the side-chains intimately pack in this region of
SD1 and help determine the orientation of the turn.
The same does not hold for SD2, since residues P49
and K50 in the linker, equivalent in position to G1
and S2 for SD1, pack and interact differently with the
residues near the middle turn of SD2. Residues
preceding P49 in the linker, which are absent at the N
terminus, may also contribute to the differences. The
final result is a flipped orientation of the middle turn.
The structural significance of the two N-terminal
residues, G1 and S2, is supported by the decrease in
Figure 4. Amide backbone chemical shift difference
between domains. The absolute difference of amide
backbone chemical shifts for equivalent residues in SD1
and SD2 are plotted. 1H contributions have been scaled to
15N ppm. Asterisks mark the position of proline residues.
At the top, boxes demark residues composing the top
loop, middle turn, and bottom knot.



anti-HIV EC50 by greater than a factor of 500 when 
these two residues are removed. 

As a consequence of the reversed middle turn,
the top loop is also reversed between domains. The
right-handedness of SD1 is mirrored by the
left-handedness of SD2. Changes in side-chain packing
caused by the natural residue differences of N9 to
D57 before the top loop and N12 to K60 at the
beginning of the top loop contribute to the reversed
handedness. Differences also occur in hydrophobic
contributions at position A11/A59. For SD1, the
domain interface provides some protection for the
A11. For A59 of SD2, protection by the interface is
not available so the hydrophobic side-chain buries
in the domain, positioning the residue much
differently. These relatively small effects combine
to cause the regions between the top loop and
middle turn to be different between SD1 and SD2.
These conformations are confirmed by the observed
RDCs for these regions, and the difference
leads to the large backbone RMSD between the two
domains.

The backbone folds of SD1 and SD2 are quite
similar, except for the region between the top loop
and middle turn. The concerted change in handedness
of the top loop and 180° reversal of the middle
turn effectively cancel each other, preserving the
remainder of the overall fold. The backbone RMSD
of the two domains is 2.5 Å, when the top loop
through the middle turn are left out of the calculation.
The major structural differences are
thereby relegated to only part of the protein, and
do not dramatically impact residues near the
carbohydrate-binding site (vide infra). This allows
scytovirin to position residues critical for carbohydrate
binding similarly between domains. Thus,
both domains of scytovirin retain the ability to bind
oligosaccharide, albeit with different affinities.

Oligosaccharide binding 

Scytovirin has been shown to bind specific
oligosaccharides found on gp41 and gp120. In
particular, scytovirin was shown to bind to a
specific tetrasaccharide substructure of the high
mannose oligosaccharide normally found decorating
HIV-1 envelope glycoproteins. The
Manα(1→2)Manα(1→6)Manα(1→6)Man tetrasaccharide
is pictured in Figure 5(a). Chemical shift
perturbation mapping of scytovirin was used to
monitor titration of this tetrasaccharide (see Figure
5(b)). Different exchange kinetics were observed for
residues in SD1 versus residues in SD2. At
500 MHz, residues in SD1 exhibit intermediate
exchange, whereas residues in SD2 exhibit fast
exchange. These exchange regimes can be shifted
by conducting the experiments at 800 MHz, where
the intermediate exchange becomes slow and the
fast exchange becomes intermediate (data not
shown). The results suggest significantly different
affinities for binding of the tetrasaccharide to each
domain. Chemical shifts as a function of ligand
concentration were fit to a two-site model (Matlab
algorithm kindly provided by Dr David Fushman,
University of Maryland), providing dissociation
constant estimates of 30 μM for SD1 and 160 μM
for SD2 (Supplementary Data Figure 4). These
measurements agree with data measured from
isothermal titration calorimetry (S. Shenoy & B. R.
O'Keefe, personal communication).

Figure 5. Oligosaccharide titrations reveal binding site. (a) Structure of the tetrasaccharide used for NMR binding
studies, where R is the same as in Adams et al.7 (b) Overlay of sections from ¹⁵N-HSQC spectra collected with increasing
amounts of tetrasaccharide. Complementary residues showing the largest chemical shift perturbation are shown.
Residues from SD1 are in red and residues from SD2 are in blue. Color gradients from light to dark correspond to
tetrasaccharide:scytovirin ratios of 0:1, 0.25:1, 0.5:1, 1:1, and 2:1. (c) Surface representation of scytovirin (SD1 pink, SD2 light
blue) colored to show residues having largest change in chemical shift upon tetrasaccharide binding (SD1 red, SD2 blue).
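The dissociation constants quoted for SD1 and SD2 come from fitting chemical shift changes against ligand concentration. A minimal sketch of that kind of fit follows, assuming a simple 1:1 binding isotherm applied to one domain at a time with fast-exchange averaging of the shifts. It is not the two-site Matlab algorithm used for the reported values, and it ignores competition between the two domains for ligand. The function names, the 0.8 ppm maximal shift, and the search grid are hypothetical; the 0.5 mM protein concentration and the titration ratios follow the values given in the text.

```python
import math

def fraction_bound(p_total, l_total, kd):
    """Occupancy of a single site for 1:1 binding, including ligand depletion.

    All concentrations are in mol/L. Solves the quadratic for the complex
    concentration and divides by the total protein concentration.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, l_totals, shifts, kd_grid):
    """Grid search over Kd (assumed approach, not the published algorithm).

    For each trial Kd the maximal shift d_max is obtained by linear least
    squares, and the Kd with the smallest squared error is returned.
    """
    best = None
    for kd in kd_grid:
        occ = [fraction_bound(p_total, l, kd) for l in l_totals]
        denom = sum(x * x for x in occ)
        if denom == 0.0:
            continue
        d_max = sum(s * x for s, x in zip(shifts, occ)) / denom
        sse = sum((s - d_max * x) ** 2 for s, x in zip(shifts, occ))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

P_TOTAL = 0.5e-3                      # titration sample, 0.5 mM scytovirin
RATIOS = [0.0, 0.25, 0.5, 1.0, 2.0]   # tetrasaccharide:scytovirin ratios
L_TOTALS = [r * P_TOTAL for r in RATIOS]
KD_GRID = [10 ** (-6 + 4 * i / 400) for i in range(401)]  # 1 uM to 10 mM

# Synthetic, noise-free shifts for a site with Kd = 30 uM (hypothetical d_max).
synthetic = [0.8 * fraction_bound(P_TOTAL, l, 30e-6) for l in L_TOTALS]
kd_fit, d_max_fit = fit_kd(P_TOTAL, L_TOTALS, synthetic, KD_GRID)
```

Under these assumptions the calculated occupancy at a 2:1 ratio is about 0.95 for a 30 μM site and about 0.79 for a 160 μM site, which illustrates why the two domains respond differently over the same titration.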

Residues that demonstrate backbone amide
chemical shift changes upon addition of tetrasaccharide
are almost identical for SD1 and SD2.
Mapping those residues onto the scytovirin structure
is shown in Figure 5(c). It appears that each
domain of scytovirin binds oligosaccharide in a
similar fashion. This agrees with biochemical
studies of the individually expressed domains.
There are three aromatic residues in each domain
(SD1, Y6, W8, F37; and SD2, Y54, W56, F85), and,
with the exception of F85 in SD2, all the aromatic
residues of scytovirin are clustered near the
binding sites (Figure 5(c)). Both NεH of tryptophan
W8 (SD1) and W56 (SD2) demonstrate
chemical shift perturbations, thus suggesting
involvement in tetrasaccharide binding. The clustering
of aromatics agrees well with an existing
model in which two aromatic residues, separated by
one residue in the amino acid sequence, play a
critical role in one class of protein-oligosaccharide
interactions.

Oligosaccharide-binding site comparison

The differences between the two oligosaccharide-binding
sites can partially be explained from the
apo structure of scytovirin. The N9 residue in SD1,
which is the site of side-chain substitution for D57 in
SD2, shows large backbone amide chemical shift
changes in the presence of tetrasaccharide, implicating
involvement in binding. The negative charge of
D57 may contribute to decreased oligosaccharide
affinity in SD2. Also, the residues between the top
loop and middle turn show minor chemical shift
perturbations upon addition of tetrasaccharide, and
some structural differences between the two domains
may contribute to differences in binding. The
proximity of G1 and S2 to surface-exposed residues
with the largest chemical shift perturbations in SD1
indicates the significance of the ordered N terminus
and local structure that contribute to the different
oligosaccharide-binding affinity. The lack of chemical
shift perturbation for G76 in SD2 differs from the
significant change observed for the equivalent G28
in SD1. Thus, even though the sites are quite similar,
enough difference exists to cause an appreciable
difference in oligosaccharide affinity. Furthermore,
both sites exhibit considerable specificity. We have
confirmed the previous observation that a related
Manα(1→6)Manα(1→6)Man trisaccharide does
not bind to scytovirin based on the absence of
observed chemical shift perturbations in NMR
titration experiments.

Discussion 

At first glance, scytovirin appears to be another
member of the hevein-like family, since it binds
oligosaccharides, has a majority of the signature
CBD cysteine residues (Figure 6(a)), and has a
similar aromatic triad involved in carbohydrate
binding. The likeness of scytovirin to CBDs, however,
does not extend to the backbone fold. The
scytovirin structure is quite distinct from that of
the hevein domain (Figure 6(b)). The difference in
structure is, in part, due to the different disulfide
bond formation. The absence of a cysteine at position
17 in scytovirin, strictly conserved in all
hevein family members, causes a dramatic rearrangement
of disulfide bonding (see Figure 6(a)). In
scytovirin, the disulfide bonding pattern is C20-C26
and C32-C38 in SD1, C68-C74 and C80-C86 in SD2
with an inter-domain C7-C55 disulfide bond. This
corresponds to a sequential disulfide bonding pattern
between the 2nd-3rd and 4th-5th (7th-8th,
9th-10th in SD2) cysteine residues with one
inter-domain disulfide bond, between the 1st-6th cysteine
residues. The disulfide bonds are very important to
the structural integrity of the two domains, and the
inter-domain disulfide bond assists in bringing the
two domains together to form the overall globular
structure. In all hevein family members, the pattern
of disulfide bonds is not sequential, but intertwined
as evidenced from disulfide bonding between the
conserved 1st-4th, 2nd-5th, 3rd-6th, and 7th-8th
cysteine residues (Figure 6(a)). The conserved
disulfide pairing makes it reasonable to assume
that the hevein domain structure is conserved across
all family members, as represented by the B. amguin
hevein domain (Figure 6(b)). The hevein domain
exhibits short α-helical and β-strand secondary
structures, contrary to the absence of such structural
elements in scytovirin.

The sequence repeat, or dual hevein motif, of
scytovirin is also seen in other CBDs. For example, a
recently described tomato lectin shows the same
duplication of hevein domains joined by proline-rich
linkers. The tomato lectin has no inter-domain
disulfide bond, hence there is no a priori expectation
that the two domains would contact or interact with
one another. The dual motif would be anticipated to
have a fold comprised of two typical hevein
domains connected by a linker, as suggested by
the model in Figure 6(b). Conversely, the inter-domain
disulfide bond of scytovirin is reminiscent
of mistletoe lectin where two CBDs are held together
by a lone, inter-domain disulfide bond. The conserved
hevein disulfide pairing of the tomato lectin
(Figure 6(a)) would suggest that the individual
domains have the hevein fold and that they are
simply tethered. It is not known if there is any
contact or interaction between the two domains, as
seen in scytovirin. This variation in domain structure
and interconnection might provide further
differentiation in carbohydrate recognition specificity;
however, further investigations are required to
examine this possibility.

Each domain of scytovirin contains a triad of
aromatic residues that is similar to the triad in
hevein domains and is known to be responsible for
carbohydrate binding in hevein. Despite the
sequence similarity to CBDs, scytovirin lacks measurable
binding affinity for chitin, monosaccharides,
or common trisaccharides. Scytovirin does not
bind the GlcNAcβ(1→4)GlcNAc structure of chitin
but instead shows high specificity for a tetrasaccharide
structure of Manα(1→2)Manα(1→6)Manα(1→6)Man.
One plausible explanation for these
differences is the altered arrangement of key
aromatic residues in the binding sites of scytovirin
compared to hevein domains. Interestingly, SD1 of
scytovirin preserves the structural arrangement of
the aromatic triad (Y6, W8, and F37) and has a
nearly identical affinity for its tetrasaccharide-binding
partner as hevein does for (GlcNAc)4.
SD2 exhibits a lower apparent binding affinity for
the tetrasaccharide, and the aromatic triad (Y54,
W56, and F85) is not as tightly clustered. The
side-chain of F85 is not readily able to participate in
binding due to the reversed arrangement of the
middle turn. Thus, we predict that scytovirin and
hevein utilize similar protein:carbohydrate interactions;
however, a different specificity is achieved as a
direct result of the altered disulfide bonding pattern
and rearrangement of the aromatic triads in each
domain. In binding experiments with the tetrasaccharide,
the two binding sites are independent with
relatively weak affinity. As noted for other CBDs,
increased affinity is achieved by binding to large
carbohydrates or glycoproteins that can deliver
carbohydrates to both scytovirin domains, and this
multivalency is likely responsible for the high
anti-HIV activity of scytovirin.

Figure 6. Sequence alignment, disulfide bonding pattern, and domain comparison of scytovirin and hevein-domains.
(a) An amino acid sequence alignment of both scytovirin domains (SVN SD1 and SD2), hevein (HEV), and the hevein-like
tomato lectin N-terminal chitin-binding modules (TL-CBMn1 and TL-CBMn2) illustrates the drastically different
disulfide bond formation of scytovirin. Disulfide bonds are indicated by lines. Gold boxes indicate cysteine residues
conserved in scytovirin and hevein-like proteins, whereas blue boxes indicate cysteine residues only found in hevein-like
CBDs. (b) Comparison of the backbone folds of the dual CBDs of scytovirin (left) to a model of two arbitrarily oriented
hevein domains (PDB ID: 1HEV) (right). All cysteine residue side-chains are shown, colored in yellow. SVN SD1 is
shown in red and SVN SD2 is shown in blue. The model of a dual hevein-domain protein without any inter-domain
disulfide bond (right) is formed from two copies of 1HEV and the linker is represented as a broken line.

The presence of two weak carbohydrate-binding
sites that result in potent anti-viral activity towards
gp41 and gp120 parallels the observations for
another cyanobacterial lectin protein with potent
antiviral activity, cyanovirin-N. Cyanovirin-N
exhibits two binding sites with micromolar affinity,
similar to scytovirin. The cyanovirin-N-binding
sites, however, do not involve aromatic residues,
based on both NMR titrations and the crystal
structure of the cyanovirin-N:Man-9 complex.
Similar to scytovirin, cyanovirin-N exhibits very
high affinity toward viral glycoprotein epitopes. A
recent study of cyanovirin-N observed precipitation
of gp41 by cyanovirin-N that could be prevented
by mutation of one of the binding sites, suggesting
that the precipitation was due to cross-linking but
concluding that the nanomolar potency was probably
not due to cross-linking. Instead, the multivalency
may be related to the spatial specificity of
glycoprotein epitopes. Our data suggest that scytovirin
may act in a parallel fashion, but the nature of
the carbohydrate-binding sites is different and leads
to a distinct carbohydrate specificity compared to
cyanovirin-N.

The structure of scytovirin introduces a novel fold
and new twist to carbohydrate-binding proteins.
Knowledge of the carbohydrate-binding sites provides
important opportunities for structural engineering
to improve this molecule as an antiviral, and
particularly anti-HIV, agent. Further studies are in
progress to better understand the protein:carbohydrate
complex. A greater understanding of how
scytovirin binds to viral glycoprotein carbohydrates
will contribute to advanced development of antiviral
therapies and potential topical microbicide
prophylaxis.



modulated pulse sequences were used to measure NH
and HαCα RDCs. Integer multiples of 0.70 and 1.0 ms were
used for HαCα and NH coupling delays, respectively.
Sixty-seven HαCα and 82 NH couplings were used for
structural refinement. Ramachandran statistics included
13 dihedral angles in the most favorable region, 32 in the
additionally allowed, 19 generously allowed and five
disallowed.



Oligosaccharide titration 

Oligosaccharides were synthesized as reported.
Titration data were collected at both 500 and 800 MHz
in NMR buffer. Ratios of approximately 0:1, 0.5:1, 1:1 and
2:1 were collected for the Manα(1→2)Manα(1→6)Manα(1→6)Man
tetrasaccharide. Ratios of 0:1, 0.5:1, 1:1, 2:1
and 10:1 were collected for the non-binding
Manα(1→6)Manα(1→6)Man trisaccharide. At 800 MHz, data were
collected at temperatures of 10, 17.5 and 25 °C for
temperature dependence of oligosaccharide binding and
tentative indication of hydrogen bonding.

Protein Data Bank accession code 

Coordinates have been deposited in the RCSB Protein 
Data Bank under accession number 2JMV. 



Materials and Methods 



Sample preparation and NMR data acquisition 

Scytovirin was expressed and purified as described.
The NMR buffer used for all experiments was 20 mM Mes
(pH 5.5), 100 μM EDTA, 10 μM NaN₃. The concentration of
scytovirin was 1 mM for the unbound sample and 0.5 mM
for the sample used for carbohydrate titration. All data,
unless otherwise noted, were collected at 25 °C. The
titration, assignment, and ¹³C-edited NOESY data were
collected on Varian Inova spectrometers with cryogenically
cooled probes at field strengths of 500, 600 and
800 MHz, respectively. ¹⁵N-edited NOESY and proline
sequential assignment data were collected at 600 MHz on
a Bruker Avance spectrometer. ¹⁵N-edited and ¹³C-edited
NOESY mixing times were 100 and 125 ms, respectively.
Heteronuclear ¹⁵N-¹H NOEs were collected with a recycle
delay of 5.5 s or recycle delay of 2.5 s and a saturation time
of 3 s. Hydrogen/deuterium exchange spectra were
collected at times of 20 min, 40 min, 1.5 h, 4 h, 16 h, and
four days after a lyophilized sample of scytovirin was
resuspended in ²H₂O. All data were processed using
nmrPipe and visualized using SPARKY.

Structure calculation and refinement against RDCs 

Structure calculations were conducted using XPLOR-NIH.
Due to the lack of extended secondary structure
predicted from Cα/Cβ chemical shifts, database calculated
dihedral angle restraints were not used. A total of 1206
intraresidue, 1005 sequential, and 634 long range (defined
as NOEs between protons of residues five or greater
positions apart in the sequence) NOE restraints were used
in the calculation. The lowest energy 20 structures of 200
calculated are reported. The precise and consistent suite of J-

The development of anti-human immunodeficiency virus
(HIV) microbicides for either topical or ex vivo use is
of considerable interest, mainly due to the difficulties in
creating a vaccine that would be active against multiple
clades of HIV. Cyanovirin-N (CV-N), an 11-kDa protein
from the cyanobacterium (blue-green algae) Nostoc
ellipsosporum with potent virucidal activity, was identified
in the search for such antiviral agents. The binding
of CV-N to the heavily glycosylated HIV envelope protein
gp120 is carbohydrate-dependent. Since previous
CV-N-dimannose structures could not fully explain
CV-N-oligomannose binding, we determined the crystal
structures of recombinant CV-N complexed to Man-9
and a synthetic hexamannoside, at 2.5- and 2.4-Å resolution,
respectively. CV-N is a three-dimensional domain-swapped
dimer in the crystal structures with two primary
sites near the hinge region and two secondary
sites on the opposite ends of the dimer. The binding
interface is constituted of three stacked α1→2-linked
mannose rings for Man-9 and two stacked mannose
rings for hexamannoside with the rest of the saccharide
molecules pointing to the solution. These structures
show unequivocally the binding geometry of high mannose
sugars to CV-N, permitting a better understanding
of carbohydrate binding to this potential new lead for
the design of drugs against AIDS.



Of the more than 30 million people infected with HIV¹ before
1997, 75-85% acquired the virus through heterosexual contacts
(1); thus AIDS is likely to continue to affect the general
population. The development of a vaccine active against multiple
clades of HIV is complicated by the high mutation rate of
the virus (2). In the absence of vaccines, there is a growing
interest in the development of anti-HIV virucides for either
topical or ex vivo use (3). A unique natural product identified in
the search for new antiviral agents was cyanovirin-N (CV-N),
originally isolated from cultures of the cyanobacterium
(blue-green algae) Nostoc ellipsosporum (4).

Nanomolar concentrations of CV-N potently inactivate diverse
strains of HIV-1, HIV-2, SIV, and FIV (4, 5), acting at the
level of the virus, not the target cell, to abort the infection
process. This is achieved by preventing essential interactions
between the envelope glycoprotein and target cell receptors
(4-6). For HIV-1, glycoprotein gp120 is primarily involved in
cell entry with approximately half of its molecular weight
provided by carbohydrates (7). Out of the 24 N-linked
oligosaccharides found on its surface, 11 are high mannose or hybrid type
(8). Studies have shown that the binding of CV-N to gp120 is
carbohydrate-dependent (4, 6, 9). Moreover, CV-N also binds
free N-linked oligosaccharides, having nanomolar affinity for
the D1D3 isomer of Man-8 and oligomannose-9 (Man-9) (10-12)
and directly competing with gp120 for CV-N binding. NMR
and isothermal titration calorimetry experiments have shown
that the binding sites exhibit different affinities for the high
mannose oligosaccharides (13).

CV-N, a 101-amino acid protein, exists in solution as either a
compact monomer or a dimer, whereas all crystal structures
show it exclusively as three-dimensional domain-swapped
dimers (14-19). Three-dimensional domain swapping is an
oligomerization process in which two or more protein chains
exchange identical domains (20).

Monomeric CV-N consists of two similar domains with an
overall ellipsoidal shape. Although the sequence of the protein
is duplicated with over 60% identity between residues 1-50 and
51-101, domain definitions based on the primary and tertiary
structures do not coincide exactly. Sequence-defined domain A
consists of residues 1-50, whereas residues 51-101 form
domain B. Each domain contains mostly β-strands and loops. Two
intramolecular disulfide bonds (C8-C22, C58-C73) are important
for the structural stability and anti-HIV activity (21). A
change of torsion angles in the hinge region (residues 49-54)
separates domains A and B of CV-N into an extended form in
which they do not contact each other. A domain-swapped dimer
is formed by two such symmetrically related extended monomers
in which domain A comes in contact with domain B' from
a different chain, the overall structure of each pseudo-monomer
(AB' and A'B) being virtually identical to the compact monomer
with the exception of the hinge residues. Although the
structures of individual pseudo-monomers resemble the compact
monomer very closely, the relative orientation of the
pseudo-monomers varies widely between different structures
(19, 22). Crystal packing seems to thermodynamically favor the
dimeric form, selectively trapping this form in the growing
crystals, decreasing the abundance of the dimer in solution,
and shifting the equilibrium from monomer to dimer.

Since the antiviral activity of CV-N results from its binding
to carbohydrates, understanding the structural basis of such
interactions is important for the potential development of this
protein as an anti-AIDS agent. In the present study, we report
medium resolution structures of the complexes of wild-type
CV-N with the oligosaccharide Man-9 (Fig. 1A) and with a
synthetic hexamannoside (Fig. 1B).

Fig. 1. Chemical structures of oligosaccharides. Man-9 (A) and synthetic hexamannoside (B), used in these studies, are displayed with
standard numbering and the binding interface highlighted in green.

This work was supported in part by federal funds from the NCI,
National Institutes of Health under Contract Number N01-CO-12400.
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
"advertisement" in accordance with 18 U.S.C. Section 1734 solely to
indicate this fact.

The atomic coordinates and structure factors (code 1M5J, 1M5M)
have been deposited in the Protein Data Bank, Research Collaboratory
for Structural Bioinformatics, Rutgers University, New Brunswick, NJ
(http://www.rcsb.org/).

Supported financially by the National Institutes of Health
Biotechnology Training Grant.

GlaxoSmithKline Research Scholar and an Alfred P. Sloan
Scholar.

To whom correspondence should be addressed: NCI, National
Institutes of Health, MCL Bldg. 536, Rm. 5, Frederick, MD 21702-1201.
Tel.: 301-846-5036; Fax: 301-846-6128; E-mail: wlodawer@ncifcrf.gov.

¹ The abbreviations used are: HIV, SIV, and FIV are human, simian,
and feline immunodeficiency viruses, respectively; CV-N, cyanovirin-N;
gp120, 120-kDa surface envelope glycoprotein of HIV; gp41, 41-kDa
transmembrane subunit of HIV envelope; ITC, isothermal titration
calorimetry; CHES, 2-(cyclohexylamino)ethanesulfonic acid; CAPS,
3-(cyclohexylamino)-1-propanesulfonic acid; CAPSO,
3-(cyclohexylamino)-2-hydroxypropanesulfonic acid.

EXPERIMENTAL PROCEDURES 
High Mannose Saccharides. Man-9 was purchased from Glyko Inc.
(Novato, CA). Solution phase synthesis of the branched hexamannoside
has been described previously (23).



Isothermal Titration Calorimetry Experiments. The calorimetric
titrations were performed on a VP-isothermal titration calorimetry
(ITC) titration calorimeter from Microcal, Inc. (Northampton, MA). In a
typical experiment, 10-μl aliquots of a CHES solution were injected
from a 250-μl syringe into a rapidly mixing (300 rpm) solution of CV-N
(cell volume, 1.3472 ml). Control experiments involved injecting
identical amounts of the CHES solution into buffer without CV-N. The
concentration of CV-N was 0.463 mM, and that of CHES was 6.48 mM.
Titrations were carried out at 30 °C in 50 mM sodium phosphate buffer,
0.2 M NaCl, 0.02% NaN₃, pH 7.5. The isotherms, corrected for
dilution/buffer effects, were fit using the Origin ITC Analysis software
according to the manufacturer's protocols. A nonlinear least square method was
used to fit the titration data and to calculate the errors. From the
binding curve, values for enthalpy, stoichiometry, and binding affinity
were extracted. Thermodynamic parameters were calculated using
$\Delta G = -RT \ln K_a$ and $\Delta G = \Delta H - T\Delta S$.

Protein Purification and Crystallization. Wild-type CV-N was
cloned, expressed, and purified from Escherichia coli as reported
previously (24). Only crystals grown at high pH in silica hydrogel
(Hampton Research) were successfully utilized for preparation of
complexes with oligosaccharides. These crystals were grown from droplets
containing an equal mixture of protein and precipitant (1 M sodium
citrate and 0.1 M CHES, pH 10.3). After 14 days at room temperature,
the crystals grew to 0.2 x 0.15 x 0.1 mm. Crystals were soaked for 48 h
in 1 mM Man-9 or hexamannose immediately before data collection.
CV-N-hexamannose co-crystals grew under a range of conditions under
which the wild-type CV-N alone would not crystallize.

Crystallographic Procedures. X-ray data for the CV-N-oligosaccharide
complexes were collected at 100 K on beamline X9B, National
Synchrotron Light Source (NSLS), Brookhaven National Laboratory,
with the ADSC Quantum4 CCD detector, at 0.92-Å wavelength. A
cryoprotectant, consisting of 80% mother liquor and 20% ethylene glycol,
was used. Data were processed with HKL2000 (25) (Table I).
Molecular replacement was carried out with AmoRe (26) using a compact
monomer as a search model. With the correct solution, the
symmetry-related mates were generated, and although domain A was kept fixed,
domain B was superimposed over the closest symmetry mate in the
linker region. The molecular replacement could not be carried out with
the extended molecule (Protein Data Bank code 3ezm) due to the
relative reorientation of the two domains under different conditions. The
CNS (27) maximum likelihood refinement procedure was used,
combined with simulated annealing. The model was rebuilt into density
using O (28) with the oligosaccharide models generated with InsightII
(Accelrys) and parameterized using XPLO-2D (29).

Table I
Crystallographic statistics

                               Man-9                        Hexamannoside

Diffraction data statistics
  Cell parameters              a = b = 61.51, c = 147.94    a = b = 61.35, c = 147.56
                               α = β = γ = 90.0             α = β = γ = 90.0
  Space group                  P4₁2₁2                       P4₁2₁2
  Molec/asymmetric unit        1                            1
  Resolution (Å)               2.5                          2.4
  Total reflections            214,251                      129,116
  Unique reflections           7,967                        11,728
  Completeness (%)             99.9                         99.8
    Last shell                 100.0                        99.6
  Avg. I/σ                     23.0                         12.1
  Rmerge (%) (a)               7.1                          5.6
Refinement statistics
  R-factor (%) (b)             24.2                         22.4
  Rfree (%) (c)                29.8                         28.9
  R.m.s.d. bonds (Å) (d)       0.009                        0.01
  Angles (deg)                 1.6                          1.8

(a) $R_{merge} = \sum |I - \langle I \rangle| / \sum I$, where $I$ is the observed intensity, and $\langle I \rangle$ is the average intensity obtained from multiple observations of symmetry-related
reflections after rejections.
(b) $R\text{-factor} = \sum ||F_o| - |F_c|| / \sum |F_o|$, where $F_o$ and $F_c$ are the observed and calculated structure factors, respectively.
(c) Rfree defined in Ref. 38.
(d) R.m.s.d., root mean square deviation.

RESULTS AND DISCUSSION 

The Overall Structure of CV-N. Two different crystal forms
of CV-N have been described. Trigonal crystals grow predominantly
at low pH (4.6) (15), whereas tetragonal crystals grow at
high pH (9.5-10.3) (19). The structures of the pseudo-monomers
of CV-N seen in both crystal forms are very similar
(Fig. 2A), but their relative orientation is quite different (19),
and the intermolecular contacts are distinct. Only the tetragonal
crystals could be used successfully for the determination
of the structures of oligosaccharide complexes.

Oligosaccharides and Their Binding. Two branched high
mannose oligosaccharides were used in this study. Man-9 was
derived from natural sources, and its structure corresponds to
an oligosaccharide that is part of the HIV-1 envelope glycoprotein
gp120. The structure (Fig. 1A) is made up of a reducing end
consisting of two β1→4-linked N-acetyl glucosamine residues
that form a chitobiose unit followed by 9 mannose residues that
form a branching, triantennary structure. The second oligosaccharide
(Fig. 1B) is a synthetic structure with six mannoses
linked similarly to the core structure of Man-9, excluding the
chitobiose unit on the reducing end (replaced by a pentyl group)
and the terminal mannose units from the triantennary arms of
Man-9.

CV-N has been reported to bind to Man-9 with nanomolar
(13-20 nM) affinity (10, 13) and to a synthetic hexamannoside
with low micromolar (2.6 μM) affinity (13). There are two distinct
sugar-binding sites in a CV-N monomer, located ~35 Å
away from each other: a high affinity primary site and a low
affinity secondary site (10). These sites were mapped to the
surface of CV-N by NMR perturbation studies even before any
direct structural data on oligosaccharide binding became available.
Residues 41-44, 50-56, and 74-78 define the primary
site, whereas residues 1-7, 22-26, and 92-95 form the secondary
site (10). In a domain-swapped dimer, there are four
sugar-binding sites: two primary sites near the hinge region and two
secondary sites on the opposite sides of the dimer, where they are
not influenced by the conformation of the hinge region. In the
domain-swapped dimer, the sugar-binding sites are formed by
intertwined loops of residues belonging to both monomers. For
example, the secondary site is created by residues 1-7 and
22-26 from the first monomer and residues 92'-95' and 101'
from the second monomer.

The primary sugar-binding site consists of a deep pocket in
close proximity to the hinge region, which can accommodate
an α1→2-linked dimannose molecule (30) (Fig. 2B). In our
structures (Table I), there was no sugar bound in the primary
site over a wide pH range, regardless of crystal soaking or
co-crystallization with the oligosaccharide. Instead, there was
a well defined, tightly bound CHES molecule from the
crystallization solution, partially obstructing the pocket and forming a
strong hydrogen bond anchoring the sulfate oxygen atom O13
to Arg-76 NH2 of CV-N. Further interactions with CV-N were
mainly hydrophobic, through the cyclohexyl ring of CHES. The
buffer molecule did not bind the secondary sugar-binding site,
even in the absence of any sugar. As crystals can be grown in
other buffers in the same pH range (CAPS, CAPSO, AMP,
glycine, and ethanolamine), it was unlikely that the CHES
molecule was involved in any critical packing interactions. We
assessed the binding affinity of CHES to CV-N using the
technique of ITC (Table II). We have used ITC previously to
characterize the binding interactions of various oligomannoses to
CV-N (13). The binding interaction between CHES and CV-N
was very weak (Kd value of ~0.1-0.5 mM) and characterized by
few polar/electrostatic interactions (ΔH value of ~0.2 kcal/mol).
This was unlike the oligomannose-CV-N interactions in which
strong favorable binding enthalpies (ΔH value of 20 kcal/mol)
had resulted in submicromolar CV-N binding affinities
(Kd 0.02-50 μM) for the oligomannoses (13). Given these
calorimetric data, it is highly unlikely that CHES could have
prevented oligosaccharide binding.
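
The thermodynamic values reported for CHES in Table II can be cross-checked with a short calculation. The sketch below assumes the standard relations ΔG = ΔH - TΔS and ΔG = -RT ln Ka (with Kd = 1/Ka) and the temperature of 303 K given in the table footnote. The gas constant value and the function names are illustrative choices, not part of the original work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K); assumed value


def gibbs_energy(delta_h, t_delta_s):
    """Return dG = dH - T*dS, with all terms in kcal/mol."""
    return delta_h - t_delta_s


def dissociation_constant_mM(delta_g, temperature=303.0):
    """Convert dG (kcal/mol) into Kd in mM, using dG = -RT ln Ka and Kd = 1/Ka."""
    k_a = math.exp(-delta_g / (R_KCAL * temperature))  # association constant, 1/M
    return 1000.0 / k_a


# Table II entries for CHES: dH = 0.156, T*dS = 5.036, dG = -4.881 kcal/mol
dg = gibbs_energy(0.156, 5.036)
kd = dissociation_constant_mM(-4.881)
print(f"dG = {dg:.3f} kcal/mol, Kd = {kd:.2f} mM")
```

Under these assumptions the computed ΔG agrees with the tabulated value to within rounding, and the derived Kd falls inside the ~0.1-0.5 mM range quoted in the text.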

Comparison of crystal (15, 19) and solution (14, 30) structures
of CV-N reveals the changing geometry of the primary
sugar-binding site upon domain swapping (Fig. 3). We can only
speculate on the significance, if any, of this shift in the relative
orientation of the two domains. Since the primary sugar-binding
site is in close proximity to the hinge region, the position of
the hinge and relative orientation of the domains has a direct

monomer, the pocket is intact, accommodating a disaccharide
in a stacked conformation (30), whereas in all domain-swapped
structures, some of the essential protein-oligosaccharide hydrogen
bonds cannot be established, leaving only the shape

molecule. AB' and A'B are pseudo-monomers. Molecular surface (B)
with the primary and secondary oligosaccharide-binding sites with
CHES bound to the former and Man-9 to the latter. Both parts of this
figure were generated with programs Bobscript (39), Raster3D (40), and
SPOCK (41).

Fig. 3. Primary oligosaccharide-binding site architectures are
shown in the compact monomer (A), the low pH domain-swapped
structure (B), and the high pH domain-swapped structure (C).



Table II
Isothermal titration calorimetry results for CHES binding

        ΔH               TΔS              ΔG (a)            Kd           N(CHES:CV-N)
        kcal/mol         kcal/mol         kcal/mol          mM
CHES    0.156 ± 0.009    5.036 ± 0.511    -4.881 ± 0.511    0.3 ± 0.2    1.00

(a) ΔG = -RT ln Ka; T = 303 K

Fig. 4. Unbiased — difference electron density maps of the binding interfaces contoured at 2.0-σ level and the atomic models
of bound oligosaccharides. A, the density for the complex with hexamannoside, visible for two stacked mannose rings: C (residue 505) and 4
(residue 504). B, the density for the complex with Man-9, visible for three stacked mannose rings: D1 (residue 506), C (residue 505), and 4 (residue
504). The final atomic coordinates of the oligosaccharides and the protein atoms in contact with them are shown for hexamannoside (C) and for
Man-9 (D). C and D were generated with the program GRASP (42).



complementarity of oligosaccharide and pocket. Specifically, in
the low pH (trigonal) crystal structure (15), the sugar-binding
pocket as seen in the compact monomeric structures (Fig. 3A) is
partially changed. The OG oxygen of Ser-52 is displaced from
its optimal position in which it provides a hydrogen bond to an
oxygen atom from a mannose ring, disrupting one of the
important hydrogen bonds and making the binding of any
saccharide sterically unfavorable (Fig. 3B). This is the case in all
trigonal crystal structures at pH 4.6-8.5, which adopt the
same domain orientation and do not show oligosaccharide
binding in the primary site. The high pH (tetragonal) crystal
structures have a different relative orientation of the domains but
still show a perturbed binding pocket. In this case, Asn-53 from
the hinge region moves farther into the pocket, reducing its
volume (Fig. 3C) and introducing unfavorable geometry for the
specific binding of a mannose ring. Because of these steric
considerations, sugar binding in this pocket may not be
favorable and was not observed in any of the tetragonal crystal
structures at various pH values. Since we do not observe
binding into the primary sugar-binding site, the question arises
whether the binding of Man-9 into this pocket is also carried
out by three stacked rings, as in the secondary binding site,
or by two stacked rings only, as reported previously for the
dimannoside (30).

The secondary sugar-binding site is located ~35 Å away from
the primary site and is not affected by the geometry of the
hinge, presenting the same conformation both in monomeric
and domain-swapped dimeric CV-N. Comparing the two
structures, the stacked position of the rings is evident from the clear
electron density. Three rings are seen in the case of Man-9 (Fig.
4A), and two are seen in the case of hexamannoside (Fig. 4B).
Inspection of the Man-9 structure obtained by molecular
dynamics (31) reveals that branches D1 and D3 are more
accessible for extended interaction than arm D2. The three terminal
rings from both arm D1 and arm D3 were modeled into the
electron density and refined. Due to the α1→6 linkage, the
geometry of arm D3 is not compatible with the observed
electron density; only mannose rings in an α1→2-linked
conformation can fit into the Man-9 electron density map. This
observation is in agreement with previous results that show that the
oligosaccharides interface with CV-N via a branch containing
mainly α1→2-linked rings. The position of mannose rings C
and 4 (residues 505 and 504) from arm D1 is very similar to the
model suggested from the solution structure of a CV-N
disaccharide complex (30). However, the stacked rings interact
much more tightly with the protein than suggested by the NMR
model, making 10 hydrogen bonds for the Man-9 and 9
hydrogen bonds for the hexamannoside binding interface (Table III).
Seven of these hydrogen bonds are conserved between the
binding rings 4 and C of the two oligosaccharides, but they are
not comparable with the dimannoside hydrogen bonds.
Mannose ring 4 of the hexamannoside, with five sugar-protein
hydrogen bonds, binds the most tightly. However, the higher number of
hydrogen bonds for rings 4 and C of hexamannoside does not
provide extra affinity as compared with Man-9. In this respect,
the terminal ring D1 (residue 506) of Man-9 must be
responsible for the orders of magnitude higher affinity of this ligand.
The D1 ring forms three hydrogen bonds with Gly-2 and Lys-3
and could also interact with the flexible Glu-101 from domain
B. These important interactions of the terminal ring D1 were
not accounted for in a recent mutagenesis study of the
secondary sugar-binding site using a two-ring binding interface model
(32). The branching mannose ring (residue 503) can also be
identified in both maps, suggesting that the α1→2-linked
stacked rings are of the main D1 arm. Since the only difference
between the D1 branches in Man-9 and hexamannoside is the
lack of one α1→2-linked mannose ring (residue 506) at the end
of the branch in the latter, the density unequivocally identifies
the D1 arm as the binding interface.

Table III
Oligosaccharide-protein hydrogen bonds

Man-9            Ring name (a)    CV-N              Distance (Å)
504 O3           4                B Asp-95 N        3.2
504 O4           4                A Thr-25 N        2.92
504 O6           4                A Glu-23 OE2      3.2
505 O3           C                A Lys-3 N         3.54
505 O3           C                B Asn-93 N        2.67
505 O4           C                A Thr-7 N         3.52
505 O4           C                A Thr-7 OG1       2.87
506 O3           D1               A Gly-2 N         3.13
506 O3           D1               A Lys-3 NZ        3.16
506 O4           D1               A Lys-3 NZ        3.66

Hexamannoside
504 O3           4                B Asp-95 N        2.9
504 O4           4                B Asp-95 N        3.03
504 O4           4                B Gly-96 N        3.62
504 O4           4                A Thr-25 N        3.57
504 O6           4                A Glu-23 OE2      3.16
505 O3           C                B Asn-93 N        2.79
505 O3           C                A Lys-3 N         3.4
505 O4           C                A Thr-7 N         3.63
505 O4           C                A Thr-7 OG1       3.09

(a) According to nomenclature from Fig. 1.
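
The hydrogen bond counts quoted above can be reproduced by tallying the entries of Table III. The sketch below is only an illustrative bookkeeping exercise: the bond lists are transcribed from Table III as read here (sugar residue, sugar atom, CV-N atom), and the variable and function names are hypothetical.

```python
from collections import Counter

# Hydrogen bonds as read from Table III: (sugar residue, sugar atom, CV-N atom).
MAN9_BONDS = [
    (504, "O3", "B Asp-95 N"),
    (504, "O4", "A Thr-25 N"),
    (504, "O6", "A Glu-23 OE2"),
    (505, "O3", "A Lys-3 N"),
    (505, "O3", "B Asn-93 N"),
    (505, "O4", "A Thr-7 N"),
    (505, "O4", "A Thr-7 OG1"),
    (506, "O3", "A Gly-2 N"),
    (506, "O3", "A Lys-3 NZ"),
    (506, "O4", "A Lys-3 NZ"),
]
HEXAMANNOSIDE_BONDS = [
    (504, "O3", "B Asp-95 N"),
    (504, "O4", "B Asp-95 N"),
    (504, "O4", "B Gly-96 N"),
    (504, "O4", "A Thr-25 N"),
    (504, "O6", "A Glu-23 OE2"),
    (505, "O3", "B Asn-93 N"),
    (505, "O3", "A Lys-3 N"),
    (505, "O4", "A Thr-7 N"),
    (505, "O4", "A Thr-7 OG1"),
]


def bonds_per_residue(bonds):
    """Count hydrogen bonds contributed by each sugar residue."""
    return Counter(residue for residue, _, _ in bonds)


# Bonds present in both complexes (same sugar atom and same protein atom).
conserved = set(MAN9_BONDS) & set(HEXAMANNOSIDE_BONDS)

print(len(MAN9_BONDS), len(HEXAMANNOSIDE_BONDS), len(conserved))
print(bonds_per_residue(MAN9_BONDS)[506], bonds_per_residue(HEXAMANNOSIDE_BONDS)[504])
```

With the table read this way, the tally gives 10 bonds for Man-9, 9 for hexamannoside, 7 shared between the two, 3 for ring D1 (residue 506) and 5 for ring 4 of the hexamannoside, in line with the numbers given in the text.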

The electron density was very clear for the binding interface
(residues 504-506) and the branching ring 3 (residue 503) but
is less well defined for the rest of the sugar molecule, which is
pointing away from the protein. The well defined rings
(residues 504-506) were modeled and refined first. These
mannopyranose rings adopt the chair conformation with the
nonreducing pyranose ring stacked over the reducing mannopyranose.
NMR and computational studies show that the conformations
of disaccharide fragments of an oligosaccharide fall within the
allowed regions of the corresponding disaccharide φ and ψ
angle maps (33). If the pyranose rings are treated as flexible
during the energy minimization studies of disaccharides, then
the φ and ψ conformational space accessed is larger than when
the pyranose rings are treated as rigid (33). This seems to be
the case of our structures, in which φ and ψ values show more
variation, possibly because of the flexibility of the
mannopyranose rings. Comparison of the torsion angles for the α1→2
linkages present in the binding interface (Table IV) shows a
rather good agreement of the values of φ and ψ for the terminal
D1→C ring linkage in Man-9 with the dimannoside solution
structure and the methyl-dimannoside crystal structure (34).
The values of φ and ψ are in good agreement for the C→4 ring
linkage in Man-9 and hexamannoside. However, φ shows a
35-50° deviation, and ψ a 5-35° deviation, when compared with
the D1→C linkage in Man-9 and the dimannoside solution
structure. Since ring D1 is absent in the hexamannoside, this
difference in the torsion angles could be attributed to the steric
constraints of the flanking mannopyranoside ring 3 (residue
503) and those of the protein-binding site. The flexible D2 and
D3 arms, which point to the solution, can adopt multiple
conformations without steric strain on the binding interface. Of
the two identical secondary binding sites, one had better
electron density. In this site, the bound Man-9 molecule reaches
over to a symmetry-related CV-N molecule, contacting residues
15, 59, and 61. These additional interactions might explain the
better anchoring of the Man-9 molecule in this site and
therefore its better electron density. In the other site, only the D1
arm binding rings and the branching Man-3 were modeled in
and refined.

Table IV
Oligosaccharide torsion angles for α1→2 linkages

Linkage between rings   Oligosaccharide           φ (degrees)   ψ (degrees)
C→4                     Man-9                     -4.5          22.7
C→4                     hexamannoside             -3.6          10.7
D1→C                    Man-9                     -39.4         45.9
2→1                     dimannoside (a)           -48.8         36.9
2→1                     methyl-dimannoside (b)    -55.5         14.5

(a) From CV-N/dimannoside solution structure (1iiy.pdb) (30).
(b) From methyl-dimannoside crystal structure (34).
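
The torsion angle comparison discussed above can be made explicit with a small calculation on the Table IV values. This is a rough sketch: it assumes that the first two rows of Table IV refer to the C→4 linkage (as the text implies) and that the first numeric column is φ and the second ψ. The dictionary and function names are hypothetical.

```python
# (phi, psi) in degrees, as read from Table IV.
C_TO_4 = {
    "Man-9": (-4.5, 22.7),
    "hexamannoside": (-3.6, 10.7),
}
REFERENCE = {
    "Man-9 D1->C": (-39.4, 45.9),
    "dimannoside (solution)": (-48.8, 36.9),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def deviation_range(index):
    """Min and max deviation of the C->4 linkages from the reference linkages.

    index 0 selects phi, index 1 selects psi.
    """
    diffs = [
        angular_difference(c[index], r[index])
        for c in C_TO_4.values()
        for r in REFERENCE.values()
    ]
    return min(diffs), max(diffs)


phi_min, phi_max = deviation_range(0)
psi_min, psi_max = deviation_range(1)
print(f"phi deviation: {phi_min:.1f} to {phi_max:.1f} degrees")
print(f"psi deviation: {psi_min:.1f} to {psi_max:.1f} degrees")
```

Under these assumptions φ differs by roughly 35-45° and ψ by roughly 14-35°, broadly consistent with the deviations quoted in the text.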

The GlcNAc ring 1 (residue 501) from Man-9 is pointing into
the solvent, its most remote atom being located ~14 Å from the
CV-N surface. Since this distance was measured in the most
extended conformation of Man-9 bound to the secondary
binding site, closer positioning of the GlcNAc ring might be possible
given the flexibility of the two intermediate rings. GlcNAc 1 is
anchored by an Asn residue on the surface of gp120, positioning
bound gp120 much closer to CV-N than suggested previously
(10). Therefore, besides oligosaccharide-mediated interactions,
discrete protein-protein interactions might also play a role in
the gp120-CV-N binding (12).

Several co-crystallization conditions were identified for
CV-N with hexamannoside. However, the co-crystal structures
did not show any sugar binding. This lack of hexamannoside
binding to the secondary binding site of CV-N is similar to the
results of solution phase NMR experiments in which binding of
the hexamannoside to the secondary binding site was not
detected (13). Also, soaks with a synthetic nonamannoside and
trimannoside did not yield binding in their respective crystal
structures. In both synthetic oligosaccharides, a short
hydrophobic pentyl segment substituted GlcNAc 1 and 2.
Nonamannoside had a tendency to disorder the crystals, raising the
mosaicity significantly. The Man-9-CV-N complex structure
shows that the branching Man-3 is already protruding from the
binding site, interacting partially with the flexible loop region






(residues 22-27). The position of the pentyl is further restricted 
in nonamannoside by the bulk of the longer branches and 
might not interact favorably with the flexible loop region, low- 
ering considerably the binding affinity of the sugar and at the 
same time having a chaotropic effect on the ordered molecules 
in the crystal. 

CV-N undergoes conformational changes upon binding of
gp120 and gp41 (6, 9, 35) with an average 11% loss of β-sheet
and 2% loss of helical structure. We attempted to determine
whether this is reflected in the flexibility of the loop in the
uncomplexed and sugar-bound forms of CV-N by comparing the
B-factors of these structures. There is a well conserved trend of
higher than average B-factors in the regions with residues
20-30, 50-57, and 70-80 (data not shown). Not surprisingly,
these are residues directly involved in the architecture of
sugar-binding sites. This suggests a flexible pocket
architecture, which is able to accommodate two or three stacked rings,
depending on the structure of the oligosaccharide.

Implications of the Structures for gp120 Binding. The
crystallographic data presented here give visual definition to the
molecular basis of the unique specificity of CV-N for Man-8 and
Man-9 moieties. The additional binding affinity of CV-N for
Man-8 and Man-9 as compared with Man-6 or hexamannoside
is the result of the additional binding energy derived from the
third mannose ring. The physiological significance of this
difference is manifested in the relative paucity of Man-8 and
Man-9 residues on normal mammalian glycoproteins. Man-8
and Man-9 oligosaccharides are rarely found in the human
system and then usually only on glycoproteins destined for
short term use and rapid degradation (e.g. tissue plasminogen
activator). In the case of certain viral glycoproteins, however,
Man-8 and Man-9 oligosaccharides are more prevalent. Such is
the case of HIV gp120, where as many as 11 high mannose
oligosaccharides can be present with five to six of these
generally in the form of Man-8 or Man-9 (36). It has been shown that
CV-N inhibits the binding of the broadly neutralizing
anti-HIV-1 antibody 2G12 (6). Incubation of gp120 with excess
CV-N prevents subsequent binding of 2G12 to gp120; however,
CV-N binding to gp120 that was preincubated with excess
2G12 was only slightly reduced as compared with binding to
gp120 alone (37). These data suggest that 2G12 binds only a
subset of the Manα1→2Man termini present on gp120,
whereas CV-N binds essentially all such residues (37). The
ability of CV-N to target these virus-associated
oligosaccharides with high affinity, while binding only with relatively
low affinity to other, more common, mammalian
oligosaccharides (e.g. Man-6), is the basis for the potential utility of this
agent as an anti-HIV microbicide.
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Solution structure of cyanovirin-N, a potent 

HIV-inactivating protein 

Carole A. Bewley, Kirk R. Gustafson, Michael R. Boyd, David G. Covell, Ad Bax, G. Marius Clore
and Angela M. Gronenborn

The solution structure of cyanovirin-N, a potent 11,000 Mr HIV-inactivating protein that binds with high
affinity and specificity to the HIV surface envelope protein gp120, has been solved by nuclear magnetic
resonance spectroscopy, including extensive use of dipolar couplings which provide a priori long range
structural information. Cyanovirin-N is an elongated, largely β-sheet protein that displays internal two-fold
pseudosymmetry. The two sequence repeats (residues 1-50 and 51-101) share 32% sequence identity and
superimpose with a backbone atomic root-mean-square difference of 1.3 Å. The two repeats, however, do not
form separate domains since the overall fold is dependent on numerous contacts between them. Rather, two
symmetrically related domains are formed by strand exchange between the two repeats. Analysis of surface
hydrophobic clusters suggests the location of potential binding sites for protein-protein interactions.



The initial events that lead to HIV infection include binding of
the virus to the host cell, activation of the virus, and ultimately
virus-cell or cell-cell fusion. During the first step of HIV
infection, the viral surface envelope glycoprotein gp120 interacts
with the CD4 receptor of the host cell, upon which gp120
undergoes a conformational change sufficient to
accommodate a subsequent interaction between gp120 and a member of
the α and β chemokine receptor families, now commonly
referred to as coreceptors. Concurrently, gp41 dissociates
from gp120, associates with the target membrane and mediates
fusion. In this paper we describe the three-dimensional
solution structure of a newly discovered cyanobacterial protein,
named cyanovirin-N, which is a highly potent inhibitor of
diverse laboratory adapted strains and clinical isolates of
HIV-1, as well as HIV-2 and SIV. The antiviral activity of
cyanovirin-N is mediated, at least in part, through high affinity
binding to gp120. Cyanovirin-N is currently under joint
NCI/NIAID investigation as a broad-spectrum virucidal and
therapeutic agent against HIV.

Cyanovirin-N was originally isolated from an aqueous
extract of a cultured cyanobacterium, Nostoc ellipsosporum,
and was identified in a screening effort aimed at the discovery
of new sources of HIV inhibitors. The primary sequence and
disulfide bonding pattern were determined by conventional
biochemical techniques, and a synthetic gene was
constructed for over-expression of the protein. Analysis of the primary
sequence of cyanovirin-N revealed the presence of two internal
repeats of 50 and 51 amino acids that show strong sequence
similarity to one another, and equivalent positions of the
disulfide bonds (Fig. 1). Cyanovirin-N is extremely resistant to
physico-chemical degradation and can withstand treatment
with denaturants, detergents, organic solvents such as
acetonitrile or methanol, multiple freeze-thaw cycles, and heat (up to
100 °C) with no subsequent loss of antiviral activity. The
primary sequence of cyanovirin-N shares no similarity with other
proteins thus far deposited in public protein data bases.



Structure determination 

The solution structure of cyanovirin-N was determined using
double and triple resonance multidimensional heteronuclear
NMR spectroscopy, making use of uniformly 15N- and
15N/13C-labeled protein. The final ensemble of structures was
calculated by simulated annealing on the basis of 2,509
experimental NMR restraints, including 334 residual dipolar
couplings (1DNH, 1DCH, 1DCαC', 1DNC' and 2DHNC'). The latter are
a function of the orientation of interatomic vectors relative to
the molecular alignment tensor, and hence provide
qualitatively different structural information from that afforded by
other NMR observables, such as NOEs, coupling constants
and chemical shifts, which are reliant on close spatial
proximity of atoms. A summary of the structural statistics is
provided in Table 1, and a superposition of the final ensemble of 40
simulated annealing structures is shown in Fig. 2a. It should
be noted that the inclusion of the residual dipolar couplings in
the structure refinement increases the coordinate precision
from 0.3 Å to 0.15 Å for the backbone atoms and from 0.54 Å
to 0.45 Å for all heavy atoms. The atomic r.m.s. shift in the
mean coordinates resulting from the inclusion of the dipolar
couplings is 0.66 Å for the backbone atoms and 0.81 Å for all
heavy atoms.
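
The orientational dependence of a residual dipolar coupling on the alignment tensor is commonly written as D(θ, φ) = Da[(3cos²θ - 1) + (3/2)R sin²θ cos 2φ], where θ and φ are the polar angles of the interatomic vector in the tensor frame, Da is the axial component and R the rhombicity. The sketch below simply evaluates this general expression. It is not the refinement procedure used in the study, and the numerical values of Da and R in the example are arbitrary placeholders.

```python
import math


def residual_dipolar_coupling(theta, phi, d_a, rhombicity):
    """Evaluate D = Da * [(3cos^2(theta) - 1) + 1.5 * R * sin^2(theta) * cos(2*phi)].

    theta, phi: polar angles (radians) of the interatomic vector in the
    principal axis frame of the alignment tensor.
    d_a: axial component of the tensor (Hz); rhombicity: dimensionless R.
    """
    axial_term = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic_term = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return d_a * (axial_term + rhombic_term)


# Placeholder tensor parameters, chosen only for illustration.
D_A = 10.0
R = 0.2

along_z = residual_dipolar_coupling(0.0, 0.0, D_A, R)           # vector along the z axis
along_x = residual_dipolar_coupling(math.pi / 2, 0.0, D_A, R)    # vector along the x axis
print(along_z, along_x)
```

Because the value depends on the vector orientation relative to a single molecular frame rather than on interatomic distance, couplings of this kind can restrain the relative orientation of distant parts of the molecule.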

Description of the structure 

Cyanovirin-N has the shape of an elongated prolate ellipsoid,
~55 Å in length with a maximum width of ~25 Å. The
secondary structure elements comprise 10 β-strands and four
short 3₁₀-helical turns. The overall topology can be described
as follows (Figs 1a, 2b). In the first sequence repeat (residues
1-50) a four-residue helical turn (residues 3-6) precedes a
three-stranded antiparallel β-sheet (β1, β2 and β3 from
residues 7-14, 16-24 and 28-36 respectively). β-strand 1 is
characterized by a wide β-bulge at Asn 10 and Ser 11. A
three-residue helical turn (residues 37-39) connects β-strand
3 to a β-hairpin formed by β-strands 4 (residues 39-43) and 5
(residues 46-50). The β-hairpin is directed away from the
three-stranded antiparallel β-sheet at an angle of ~140° (Figs
2b, 3a). This topology is exactly repeated in the second
sequence repeat (residues 51-101) wherein the second
triple-stranded antiparallel β-sheet is formed by β-strands 6
(residues 57-64 with a wide β-bulge at Asn 60 and Thr 61), 7
(residues 67-75) and 8 (residues 79-87), and the second
β-hairpin by β-strands 9 (residues 91-94) and 10 (residues
97-100). The first and second sequence repeats are oriented
opposite to one another with respect to the pseudosymmetric
two-fold axis, which is directed into the plane of the paper in
the view shown in Fig. 2b, such that β-strands 1 and 6 lie
adjacent and are oriented antiparallel to one another. This
arrangement places each of the β-hairpins on top of the triple
stranded β-sheet of the other half of the molecule with an
angle of ~40° between the β-hairpin and the central strand of
the underlying β-sheet.

Fig. 1 Secondary structure elements and structure alignment of cyanovirin-N. a, Secondary structure and global topology of cyanovirin. Boxed
residues denote β-strands, circled residues 3₁₀-helical turns, solid arrows indicate backbone hydrogen bonds from amide protons to carbonyl oxygens,
the dotted arrow indicates hydrogen bonds bridged by a water molecule, and offset boxes denote the positions of β-bulges. The dotted line
demarcates the first and second sequence repeats. b, Structure alignment of the first and second sequence repeats. Yellow boxes denote conserved
residues, red boxes identical residues; black and red lines demarcate domains A and B respectively.

Although cyanovirin-N can be clearly divided into two
sequential sequence repeats, the individual repeats do not
form separate domains (Fig. 3a,b). Rather, the overall fold of
the molecule depends on numerous contacts between the two
repeats. Indeed, the interaction between the two repeats of
cyanovirin-N buries 3,085 Å² of accessible surface area
(1,551 Å² for the first repeat and 1,534 Å² for the second). In
addition, there are several electrostatic interactions between
the first and second repeats which include hydrogen bonds
between Gly 15(NH) and Thr 61(Oγ), Asn 37(NδH2) and
Asn 53(Oδ), Asp 44(Oδ) and both Arg 76(NH) and
Arg 76(Nε), and Leu 47(NH) and Thr 83(Oγ).

The overall structure of cyanovirin-N, however, can be
divided into two symmetrically related domains, A and B,
formed by strand exchange between the two sequence repeats
(Fig. 3c). Domain A contains the N- and C-termini and
comprises residues 1-39 and 90-101; domain B extends from
residues 39 to 90. Thus, domains A and B correspond to the
'top' and 'bottom' halves of the molecule in the views shown in
Fig. 3a,c. Each domain contains the triple-stranded
antiparallel β-sheet of one repeat and the β-hairpin of the other repeat,
and the two domains are joined together by helical turns 2
(residues 37-39) and 4 (residues 88-90) (Fig. 3c).
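
The strand exchange described above can be checked by assigning each β-strand to a domain from the residue ranges given in the text. The sketch below is an illustrative bookkeeping exercise only: the strand and domain boundaries are those quoted in this section, a strand is counted in a domain when it lies entirely within one of that domain's segments, and all names are hypothetical.

```python
# Beta-strand residue ranges as quoted in the text (strand number: (first, last)).
STRANDS = {
    1: (7, 14), 2: (16, 24), 3: (28, 36), 4: (39, 43), 5: (46, 50),
    6: (57, 64), 7: (67, 75), 8: (79, 87), 9: (91, 94), 10: (97, 100),
}

# Domain definitions as quoted in the text (lists of inclusive residue segments).
DOMAIN_A = [(1, 39), (90, 101)]
DOMAIN_B = [(39, 90)]


def strand_in_domain(strand, segments):
    """True if the strand lies entirely inside one segment of the domain."""
    first, last = strand
    return any(lo <= first and last <= hi for lo, hi in segments)


def strands_in(segments):
    """Sorted strand numbers that fall inside the given domain."""
    return sorted(n for n, s in STRANDS.items() if strand_in_domain(s, segments))


def repeat_of(strand_number):
    """Sequence repeat (1: residues 1-50, 2: residues 51-101) containing a strand."""
    return 1 if STRANDS[strand_number][1] <= 50 else 2


print("domain A strands:", strands_in(DOMAIN_A))
print("domain B strands:", strands_in(DOMAIN_B))
```

With these ranges, domain A collects the three-stranded sheet of the first repeat (strands 1-3) together with the hairpin of the second repeat (strands 9 and 10), while domain B collects the hairpin of the first repeat (strands 4 and 5) and the sheet of the second (strands 6-8).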
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Fig. 2 Overall structure of cyanovirin-N. a, Stereoview showing superpo- 
sitions of the ensemble of the final 40 simulated annealing structures of 
cyanovirin-N. The backbone is shown in red, the disulfide bridges in
green, and all other side chains in blue. b, Ribbon drawing showing
cyanovirin-N from the 'front' of the molecule. β-strands are displayed in
blue and helical turns in red, and the location of the two-fold axis of
pseudosymmetry is shown as a black dot.



The positioning of the β-hairpins with respect to the underlying
triple-stranded β-sheets is determined by numerous
hydrophobic interactions. In domain A, Phe 4, Leu 18, Ile 34 and
Leu 36 located in the β-sheet of the first repeat interact with Leu
87, Ile 91 and Leu 98 which are part of the β-hairpin of the
second repeat (Fig. 4a). Note that leucines 36 and 87 are located
close to the two-fold axis and assume inverted positions in the
packing of domain B (Fig. 4b). The disulfide bridge between Cys
8 and Cys 22 has a right-handed hook conformation, and the Sγ
atoms are in van der Waals contact with side chains of Phe 4, Asn
93 and Leu 98. A hydrogen bond between the NδH2 group of Asn
93 and the Oγ atom of Thr 7 bridges β-strands 9 and 1. Finally, a
tightly bound water molecule, identified by the observation of
ROEs from water to the amide protons of Ala 92, Lys 99 and Glu
101, serves to bridge hydrogen bonds between the backbone
carbonyls of His 90 and Lys 99 and the backbone amide of Ala 92.

Fig. 4b shows domain B viewed in the same orientation as
domain A displayed in Fig. 4a. A nearly identical set of
interactions is apparent: namely, the equivalently positioned side
chains of Phe 54, Leu 69, Ile 85 and Leu 87 of the second
triple-stranded β-sheet pack against Leu 36, Ile 40 and Leu 47 of the
first β-hairpin. Note that the side chain conformation of Ile 85
with χ1/χ2 angles in the t/t rotamers differs from the equivalent
Ile 34 which has χ1/χ2 angles in the g-/t rotamers, presumably
due to the presence of the tightly bound water wedged between
β-strands 9 and 10. The disulfide bridge between Cys 58 and Cys
73 occupies the same position as that between Cys 8 and Cys 22,
with the Sγ atoms in close contact with Phe 54, Asn 42 and Leu
47; and a hydrogen bond occurs between the side chains of Asn
42 (in β-strand 4) and Thr 57 (in β-strand 6). In addition, the
side chains of Thr 61 and Ala 71, which are in van der Waals
contact with one another, pack against the aromatic ring of Phe
54 providing an auxiliary element to the core not seen in
domain A where two serines (at positions 11 and 20) are
substituted for Thr 61 and Ala 71.

At the interface of the two domains (Fig. 4c), a cluster of
hydrophobic residues, comprising Val 39, His 90, Trp 49 and Tyr
100, brings together helical turns 2 and 4 and the tips of
β-strands 5 and 10 respectively (Fig. 4c). Finally, the interactions
between Trp 49 and Asp 89, which include both hydrophobic
contacts and a hydrogen bond between the NεH atom of Trp 49
and the Oδ atom of Asp 89, contribute to the closing of this gap.

Given the strong internal sequence similarity of the two
repeats of cyanovirin-N, the nearly identical structures
observed for the two repeats is not surprising. A structural
alignment between the first and second repeats show 16
residues to be identical and 19 residues to be conservatively
replaced giving rise to an overall 70% similarity (Fig. 1b), and
the two repeats can be superimposed with a Cα atomic r.m.s.
difference of 1.3 Å for all 50 residues (Fig. 3b). Likewise,
domains A and B can be superimposed with a Cα atomic r.m.s.
difference of 1.3 Å (Fig. 3d). A noteworthy feature that results
from the internal two-fold pseudosymmetry is the adjacent
placement of the N- and C-termini, wherein the side chains of
Leu 1 and Glu 101 are in van der Waals contact. Thus Leu 1
and Glu 101 form a noncovalent bridge over β-strand 9 that
mimics the loop connecting β-strand 5 and helical turn 3,
which similarly crosses over β-strand 4 (Fig. 1b). This close
arrangement of the N- and C-termini may help to explain
results from mutagenesis studies that showed that removal of
three consecutive residues from either the N- or C-terminus
reduces the antiviral activity by over two orders of magnitude.

Similarities to other proteins

The function of cyanovirin-N in the cyanobacterium from
which it was isolated remains unknown. An automated search
of protein and gene sequence data banks failed to return any
sequences similar to cyanovirin-N. Similarly, a search of the
Brookhaven protein structure database using the program
DALI did not reveal any proteins with significant overall
structural similarity (Z score > 3) to cyanovirin-N. The two
individual domains, however, bear a distant resemblance to
the SH3 fold (Fig. 5). The closest match is with the
hyperthermophile DNA binding protein Sac7d for which 41
residues can be superimposed onto each domain of
cyanovirin-N with a Cα atomic r.m.s. difference of ~3 Å. The
match comprises the triple stranded antiparallel β-sheet and
the two helical turns within each domain. The β-hairpin,
however, in the domains of cyanovirin-N, is flipped by ~180°
about its long axis, relative to that in Sac7d and other SH3
domains. It should be noted that in SH3 domains of signaling
proteins the equivalent structural element to this hairpin is
extended into a long, so-called RT loop which forms part of
the binding site for polyproline containing peptides.

Fig. 3 Internal two-fold symmetry of cyanovirin-N. a, Ribbon diagram with the two
sequential repeats colored green (residues 1-50) and red (residues 51-101). b, Best-fit
superposition of the backbone (N, Cα, C) atoms of the two sequential repeats colored as in (a). c,
Ribbon diagram with domains A (residues 1-39 and 90-101) and B (residues 39-90) colored
blue and gold, respectively. d, Best-fit superposition of the backbone (N, Cα, C) atoms of
domains A and B colored as in (c).

Potential binding surfaces of cyanovirin-N

Several observations in regard to the antiviral activity of
cyanovirin-N suggested early on that its ability to inactivate
diverse strains of HIV may be a result of interactions between
cyanovirin-N and the HIV envelope glycoprotein gp120. First,
pretreatment of virus with cyanovirin-N prevented virus-cell
fusion and infection of cells, while pretreatment of cells with
cyanovirin-N offered no protection. Second, delayed addition
experiments revealed that cyanovirin-N had to be added to cells
before or shortly after addition of virus to afford maximum
antiviral activity. And third, cyanovirin-N inactivated diverse
laboratory strains and clinical isolates of HIV-1, HIV-2 and SIV,
all of which share similar surface envelope glycoprotein
functions. In the case of HIV-1, the clinical isolates included
M-tropic, T-tropic and dual-tropic strains, all of which were inhibited at
comparable low nanomolar concentrations. Collectively these
results suggested that cyanovirin-N inhibits fusion and viral
transmission by direct interactions with the virus as opposed to
the target cell and any of its receptors (such as CD4, CXCR4,
CCR5). Through a variety of experimental approaches,
cyanovirin-N was shown to bind avidly to gp120, including
recombinant non-glycosylated gp120. Further, pretreatment of
cyanovirin-N with exogenous, virus-free gp120 resulted in a
concentration-dependent decrease in antiviral activity. The
recombinant cyanovirin-N used in the NMR structural studies
had gp120 binding and anti-HIV properties that were
indistinguishable from those of cyanovirin-N isolated from its
natural source.

Since surface hydrophobicity plays a key role in protein-protein
interactions, we have mapped the most hydrophobic surface
clusters on cyanovirin-N using the method of Covell and
coworkers to predict which regions of cyanovirin-N may be
interacting with the viral envelope. Fig. 6 shows two views of a
surface representation of cyanovirin-N onto which have been
mapped the electrostatic potential (left-hand panels) and the two
highest ranking hydrophobic clusters (center panels).
The top ranking surface hydrophobic cluster (Fig. 6a) is
located in domain A, comprises Leu 1, Gly 2, Lys 3, Gln 6, Thr 7,
Thr 25, Asn 26, Asn 93, Ile 94 and Asp 95, and is centered
around the N-terminus and the turns connecting β-strands 2
and 3 and β-strands 9 and 10. This cluster of amino acids
forms an extensive and curved ridge that surrounds the cleft
between the first triple-stranded β-sheet and the second
β-hairpin (Fig. 6a, right). Comparison of the hydrophobic
cluster with the electrostatic surface shows that this ridge, as well
as the cleft, is predominantly neutral, with the exception of the
positive charge from the NζH3+ group of Lys 3 and the negative
charge from the carboxylate of Asp 95 located at the top right
and bottom left corners respectively, of the hydrophobic
cluster. The observation that a large portion of this highest
ranking hydrophobic cluster consists of the first three N-terminal
residues provides a rationale for the finding that a cyanovirin-N
mutant lacking the first three N-terminal residues is virtually
inactive. It should be noted that the symmetrically equivalent
surface region in domain B of cyanovirin-N, which would
consist of helical turn 3 and the turns between β-strands 4 and
5 and β-strands 7 and 8 (surface not shown) carries a formal
charge of -2 wherein the neutral residues Gln 6, Thr 26 and Ala
92 (not labeled) are substituted by Glu 56, Arg 76 and Glu 42
respectively, and Lys 3 is replaced by Asn 53, resulting in a large
decrease in surface hydrophobicity and a very different charge
distribution.

The second highest ranking surface hydrophobic cluster is
located in domain B, comprises Ala 64, Gly 65, Ser 66, Glu 68,
Ala 70, Lys 84 and Asn 86, spans regions of β-strands 6, 7 and 8
and includes the turn between β-strands 6 and 7 (Fig. 6b). Ala
70, Ala 64, Gly 65 and Ser 66 form a neutral vertical ridge while
Lys 84 and Glu 68 present positive and negative charges
respectively, at the left edge of the region (Fig. 6b).

It is known that cyanovirin-N does not bind to gp120 at either
the V3 loop or the CD4 binding site regions of gp120, which
have been well characterized. Thus, future structural studies of
cyanovirin-N complexed to the relevant, folded domain of
gp120 should shed light on the molecular mechanisms of the
conformational changes necessary for initiation of the fusion
event, and may also provide more structural clues concerning
the interactions between gp120 and gp41.

Methods 

Expression, purification and sample preparation. Cloning and
expression of a synthetic gene for cyanovirin-N has been described
elsewhere. Briefly, the appropriate DNA coding sequence was
subcloned into the E. coli vector pFLAG, followed by transformation
of E. coli strain BL21. Uniform (>95%) ¹⁵N and ¹³C labeling was
obtained by growing the cells in modified minimal medium
containing ¹⁵NH₄Cl and/or ¹³C₆-glucose as the sole nitrogen and
carbon sources respectively. Cells were grown at 37 °C and protein
expression was induced for three hours with 1 mM
isopropyl-D-thiogalactoside. The cells were harvested, resuspended
in 10 mM Tris buffer, pH 7.4, 5 mM EDTA, and 5 mM benzamidine,
lysed by passage through a French press, and cleared by
centrifugation. The supernatant was applied directly to a
preparative Bakerbond C4 wide pore column and eluted in a stepwise
manner with water, 2:1 v/v water-methanol, 1:2 v/v water-methanol,
and methanol. Fractions containing cyanovirin-N were combined and
further purified by reversed-phase HPLC using a C18 column
equilibrated with 0.05% trifluoroacetic acid and eluted with a
linear gradient of 20% to 40% acetonitrile. Based on amino acid
analysis and UV spectroscopy, the extinction coefficient of
cyanovirin-N is 9,400 mol⁻¹ cm⁻¹ at 280 nm. Samples for NMR
contained ~1.4 mM protein at pH 6.1.
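
As a rough illustration of how an extinction coefficient of this kind is typically used, the following Python sketch applies the Beer-Lambert law to estimate protein concentration from a 280 nm absorbance reading. The function name, the 1 cm path length, and the example absorbance and dilution factor are assumptions made for illustration and do not come from the text; the coefficient is assumed to be in molar units (M⁻¹ cm⁻¹).

```python
# Beer-Lambert estimate of protein concentration from A280.
# ASSUMPTION: the quoted coefficient is treated as 9,400 M^-1 cm^-1.
EXTINCTION_280 = 9400.0


def concentration_mM(a280, dilution_factor=1.0, path_cm=1.0,
                     epsilon=EXTINCTION_280):
    """Return the concentration (mM) of the undiluted sample.

    a280            -- absorbance measured at 280 nm on the diluted sample
    dilution_factor -- fold dilution applied before measurement (hypothetical)
    path_cm         -- cuvette path length in cm (assumed 1 cm)
    """
    if a280 < 0 or dilution_factor <= 0 or path_cm <= 0 or epsilon <= 0:
        raise ValueError("inputs must be positive (a280 may be zero)")
    molar = a280 / (epsilon * path_cm) * dilution_factor  # A = eps * c * l
    return molar * 1000.0


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.658 after a 20-fold dilution.
    print(f"{concentration_mM(0.658, dilution_factor=20.0):.2f} mM")
```

With these made-up inputs the estimate comes out near the ~1.4 mM sample concentration quoted above; the reading itself is invented for the example.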

NMR spectroscopy. NMR experiments were carried out at 27 °C on
Bruker DMX500, DMX600 and DMX750 spectrometers equipped with
x,y,z-shielded gradient triple resonance probes. Spectra were
processed with the NMRPipe package, and analyzed using the
programs PIPP and STAPP²⁴. ¹H, ¹⁵N and ¹³C resonance assignments
were made from the following 3D through-bond heteronuclear
correlation experiments: CBCA(CO)NH, CBCANH, HNCO, HNHA,
DIPSI-H(CCO)NH, DIPSI-C(CCO)NH, HCCH-COSY and HCCH-TOCSY. ³JHNα,
³JC'Cγ (aromatic, methyl and methylene), ³JNCγ (aromatic, methyl
and methylene), and ³JCC couplings were measured by quantitative J
correlation spectroscopy. φ and ψ backbone torsion angle
restraints were derived from ³JHNα and ³JCOCO coupling constants,
the three-bond amide deuterium isotope effect on the ¹³Cα shifts
(measured from a 3D HCA(CO)N recorded on a protein sample
dissolved in 50% H₂O/50% D₂O)²⁶, and the backbone (¹⁵N, NH, ¹³Cα,
¹³Cβ, ¹³C', Hα) secondary chemical shifts using the program TALOS
(G. Cornilescu, F. Delaglio and A.B., in preparation). The latter
comprises a database of residue triplets correlating φ/ψ angles
(derived from high resolution structures) and their corresponding
backbone secondary chemical shifts, and is based on the premise
that when a string of three amino acids shows high similarity in
secondary shifts and residue type with a string of amino acids in
the database, the central residues of the two strings are likely
to have similar backbone torsion angles. Side chain torsion angles
were derived from NOE/ROE data and three-bond heteronuclear
coupling constants. Interproton distance restraints were derived
from 3D ¹⁵N (120 ms mixing time) and ¹³C (45 and 120 ms mixing
time) separated NOE experiments, and 4D ¹³C/¹⁵N (140 ms mixing
time) and ¹³C/¹³C (140 ms mixing time) separated NOE experiments.
Location of bound water was determined by means of a 2D
H₂O-ROE-¹H-¹⁵N-HSQC (mixing time 45 ms) spectrum. ¹DNH, ¹DCαH,
¹DCαC', ¹DNC' and ²DHNC' residual dipolar couplings were obtained
by taking the difference in the corresponding J splittings
measured on oriented (in 4% 3:1 DMPC:DHPC at 38 °C) and isotropic
(in water) cyanovirin-N. ¹JNH, ¹JCαH and ¹JCαC' couplings were
obtained from a 2D IPAP [¹⁵N,¹H]-HSQC experiment to generate two
spectra containing either the upfield or downfield ¹⁵N doublet
component, a 3D ¹Hα-coupled(F₁) HCA(CO)N experiment, and a 2D
constant time ¹³C'-coupled/¹Hα-decoupled(F₁) ¹H-¹³C HSQC
experiment respectively. ¹JNC' and ²JHNC' couplings were obtained
from a 2D ¹³C'-coupled/¹³Cα-decoupled(F₁) ¹H-¹⁵N HSQC
experiment³⁰. The precision of the measured ¹DNH, ¹DCαH, ¹DCαC',
¹DNC' and ²DHNC' dipolar couplings was ~0.5-1.0 Hz, ~1-1.5 Hz,
~1.0-1.5 Hz, ~0.5-1.0 Hz and ~1.0-1.5 Hz respectively.

Fig. 4 Side chain contacts forming the core of cyanovirin-N. Views
showing the core of a, domain A and b, domain B. Domain B is shown
in the same orientation as domain A. c, The interface of the two
domains. A Cα backbone worm is shown in blue, hydrophobic residues
in red, aromatic residues in green, disulfide bonds in yellow, and
all other residues in magenta. The tightly bound water in domain A
which serves to bridge backbone hydrogen bonds between β-strands 9
and 10 is shown as a cyan colored sphere.




Fig. 5 Comparison of the fold of domains a, A and b, B of
cyanovirin-N with c, the hyperthermophile DNA binding protein
Sac7d and d, the SH3 domain of spectrin.



The measured ¹DNH values ranged from -31 to +20 Hz, and the
normalization factors (given by γNγH⟨rNH⁻³⟩/γAγB⟨rAB⁻³⟩, where γ
and r represent gyromagnetic ratios and distances respectively)
employed for ¹DCαH, ¹DCαC', ¹DNC' and ²DHNC' relative to ¹DNH were
0.48, 5.36, 9.04 and 3.04 respectively. The magnitudes of the
axial and rhombic components of the alignment tensor Dᴺᴴ
were obtained by examining the distribution of normalized
dipolar couplings, which yielded values of Dₐᴺᴴ = -17.0 Hz and
R = 0.17, where Dₐᴺᴴ is the axial component of the tensor and R
is the rhombicity defined as the ratio of the rhombic to axial
components of the tensor. This value of Dₐᴺᴴ corresponds to a
value of 1.48 × 10⁻³ for Aₐ, which is the unitless axial component
of the molecular alignment tensor A. Heteronuclear ¹⁵N-{¹H}
NOEs were measured as described and identified only a single
residue with a ¹⁵N-{¹H} NOE less than 0.6, namely Ser 52, which
had an NOE value of ~0.4.
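
The normalization factors and the relation between the axial component in Hz and the unitless alignment magnitude can be checked numerically. The Python sketch below is illustrative only: the gyromagnetic ratios are standard tabulated magnitudes, but the effective internuclear distances, the order parameter of 1, and the dipolar prefactor convention (μ₀h/16π³)·S·γNγH⟨rNH⁻³⟩ are assumptions not stated in the text, and only magnitudes (not signs) are computed.

```python
import math

# Gyromagnetic ratio magnitudes in units of 1e7 rad s^-1 T^-1.
GAMMA = {"H": 26.7522, "C": 6.7283, "N": 2.7126}

# ASSUMED effective internuclear distances in angstroms (illustrative).
R_NH = 1.02
PAIRS = {
    "1D_CaH": ("C", "H", 1.08),
    "1D_CaC'": ("C", "C", 1.525),
    "1D_NC'": ("N", "C", 1.341),
    "2D_HNC'": ("H", "C", 2.00),
}


def normalization_factor(nuc_a, nuc_b, r_ab, r_nh=R_NH):
    """gamma_N*gamma_H*<r_NH^-3> divided by gamma_A*gamma_B*<r_AB^-3>."""
    numerator = GAMMA["N"] * GAMMA["H"] / r_nh ** 3
    denominator = GAMMA[nuc_a] * GAMMA[nuc_b] / r_ab ** 3
    return numerator / denominator


def axial_alignment_magnitude(da_nh_hz, r_nh=R_NH, order_parameter=1.0):
    """|A_a| from |D_a(NH)|, assuming D_a = (mu0*h/16pi^3)*S*gN*gH*r^-3*A_a."""
    mu0_over_4pi = 1.0e-7          # T m A^-1
    hbar = 1.054571817e-34         # J s
    r_m = r_nh * 1.0e-10           # angstrom -> metre
    prefactor_hz = (mu0_over_4pi * hbar * GAMMA["H"] * 1e7 * GAMMA["N"] * 1e7
                    / (2.0 * math.pi * r_m ** 3))
    return abs(da_nh_hz) / (order_parameter * prefactor_hz)


if __name__ == "__main__":
    for name, (a, b, r) in PAIRS.items():
        print(f"{name}: {normalization_factor(a, b, r):.2f}")
    print(f"|A_a| = {axial_alignment_magnitude(-17.0):.2e}")
```

Under these assumed distances the factors come out close to the quoted 0.48, 5.36, 9.04 and 3.04, and |Aₐ| close to 1.48 × 10⁻³; different choices of effective bond length shift the values slightly.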

Structure calculations. Approximate interproton distance
restraints, derived from multidimensional NOE spectra, were
grouped into four distance ranges, 1.8-2.7 Å (1.8-2.9 Å for NOEs
involving NH protons), 1.8-3.3 Å (1.8-3.5 Å for NOEs involving NH
protons), 1.8-5.0 Å and 1.8-6.0 Å, corresponding to strong,
medium, weak and very weak NOEs respectively. 0.5 Å was added to
the upper bound for distances involving methyl groups to account
for the higher apparent intensity of the methyl resonances.
Distances involving non-stereospecifically assigned methylene
protons, methyl groups, and Hδ and Hε protons of Tyr and Phe, were
represented as a (Σr⁻⁶)⁻¹ᐟ⁶ sum³³. The structures were calculated
by simulated annealing¹²,³⁴ using the program CNS³⁵, adapted to
incorporate pseudopotentials for three-bond coupling constants,
secondary ¹³Cα/¹³Cβ chemical shifts, proton chemical shifts and
residual dipolar coupling⁴⁰ restraints, and a conformational
database potential for the non-bonded contacts derived from very
high resolution (1.7 Å or better) X-ray structures⁴¹,⁴². The
target function that is minimized during simulated annealing and
restrained regularization comprises quadratic harmonic potential
terms for covalent geometry, ³JHNα coupling constant restraints,
secondary ¹³Cα and ¹³Cβ chemical shift restraints, ¹H chemical
shift restraints, and dipolar coupling restraints; square-well
quadratic potentials for the experimental distance and torsion
angle restraints; a quartic van der Waals repulsion term and a
conformational database potential term for the non-bonded
contacts. The latter biases sampling during simulated annealing
refinement to conformations that are likely to be energetically
possible by effectively limiting the choices of dihedral angles to
those that are known to be physically realizable⁴¹,⁴². There were
no hydrogen-bonding, electrostatic, or 6-12 Lennard-Jones
empirical potential energy terms in the target function.
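
To make the restraint terms concrete, the following Python sketch shows, under stated assumptions, how an r⁻⁶ sum over ambiguous proton pairs and a square-well quadratic distance penalty can be evaluated. It is not the CNS implementation; the function names and the default force constant of 1.0 are hypothetical, and only the strong, medium and weak classes are tabulated for brevity.

```python
# Base (lower, upper) bounds in angstroms for three NOE classes.
BASE_BOUNDS = {"strong": (1.8, 2.7), "medium": (1.8, 3.3), "weak": (1.8, 5.0)}


def noe_bounds(strength, involves_nh=False, involves_methyl=False):
    """Return (lower, upper) bounds with the NH and methyl corrections."""
    lower, upper = BASE_BOUNDS[strength]
    if involves_nh and strength in ("strong", "medium"):
        upper += 0.2   # 2.7 -> 2.9 and 3.3 -> 3.5 for NOEs involving NH
    if involves_methyl:
        upper += 0.5   # allow for higher apparent methyl intensity
    return lower, upper


def r6_sum_distance(distances):
    """Effective distance (sum of r^-6)^(-1/6) over equivalent proton pairs."""
    if not distances or any(r <= 0 for r in distances):
        raise ValueError("need at least one positive distance")
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)


def square_well_energy(r, lower, upper, force_constant=1.0):
    """Zero inside [lower, upper]; quadratic in the violation outside it."""
    if r < lower:
        violation = lower - r
    elif r > upper:
        violation = r - upper
    else:
        violation = 0.0
    return force_constant * violation ** 2


if __name__ == "__main__":
    # Two hypothetical methylene protons, each 3.0 angstroms from a partner.
    r_eff = r6_sum_distance([3.0, 3.0])
    lo, hi = noe_bounds("strong", involves_nh=True)
    print(r_eff, square_well_energy(r_eff, lo, hi))
```

Note that the effective distance from the r⁻⁶ sum is always shorter than the shortest contributing distance, which is why it can stand in for an unassigned pair without biasing the restraint toward either proton.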

Structure figures were generated using the programs MOLMOL⁴³,
GRASP⁴⁴ and RIBBONS⁴⁵. The secondary structure and topology were
analyzed using the program PROMOTIF. Electrostatic calculations
were performed with GRASP⁴⁴. Calculation, ranking and mapping of
surface hydrophobic clusters was carried out as described²⁰⁻²².

Fig. 6 a,b, Two views mapping the electrostatic potential
(left-hand panels) and the two highest ranking surface
hydrophobic clusters (center panels) on the molecular surface of
cyanovirin-N. The first (a) and second (b) highest ranking
hydrophobic clusters are located in domains A and B respectively.
In the left-hand panels, the electrostatic potential is colored
from red (negative charge) to blue (positive charge). In the
center panels regions of highest hydrophobicity are colored
yellow, those of lowest hydrophobicity are colored purple, and the
gradient from yellow to white to purple corresponds to decreasing
hydrophobicity. Shown in the right-hand panels are the Cα worm
representations in the same orientation as the corresponding
surfaces where hydrophobicity has been mapped onto the backbone
worm with the same color scheme used for the center panels.

Table 1 Structural statistics

                                                              <SA>                    (SA)r
R.m.s. deviations from experimental distance restraints (Å)
  All (1,241)                                                 0.013 ± 0.001           0.010
  interresidue sequential (|i - j| = 1) (418)                 0.008 ± 0.00[?]         0.005
  interresidue medium range (1 < |i - j| ≤ 5) (171)           0.005 ± 0.002           0.007
  interresidue long range (|i - j| > 5) (540)                 0.016 ± 0.002           0.014
  intraresidue (20)                                           0.028 ± 0.011           0.016
  bound water (8)                                             0.001 ± [illegible]     0.000
  H-bonds (84)                                                0.017 ± 0.006           0.009
R.m.s. deviations from exptl dihedral restraints (°) (334)    0.2[?]6 ± [illegible]   0.175
R.m.s. deviations from ³JHNα coupling constants (Hz) (81)²    0.60 ± 0.01             0.61
R.m.s. deviations from secondary ¹³C shifts (p.p.m.)
  ¹³Cα (82)                                                   0.85 ± 0.01             0.84
  ¹³Cβ (75)                                                   1.17 ± 0.01             [missing]
R.m.s. deviations from ¹H shifts (p.p.m.) (362)               0.25 ± 0.002            0.25
R.m.s. deviations from residual dipolar couplings (Hz)
  ¹DNH (Hz) (84)                                              0.50 ± 0.02             0.50
  ¹DCαH (Hz) (77)                                             1.12 ± 0.03             1.13
  ¹DCαC' (Hz) (44)                                            1.26 ± 0.01             [missing]
  ¹DNC' (Hz) (66)                                             0.55 ± 0.01             0.56
  ²DHNC' (Hz) (63)                                            1.25 ± 0.01             1.26
Deviations from idealized covalent geometry
  bonds (Å) (1,519)                                           0.004 ± 0.0007          0.005
  angles (°) (2,724)                                          0.599 ± 0.007           0.757
  impropers (°) (775)                                         0.754 ± 0.019           0.7[?]4
Measures of structure quality
  E_LJ (kcal mol⁻¹)⁵                                          -434 ± 5                -434
  PROCHECK
    Residues in most favorable region of Ramachandran plot    85.4 ± 0.7              87.0
    No. of bad contacts per 100 residues                      5.7 ± 1.1               5.0
Coordinate precision (Å)
  backbone (N, Cα, C', O)                                     0.[illegible]5 ± 0.02
  all non-hydrogen atoms                                      0.45 ± 0.03

[...] restrained regularization of the mean structure SA. The
number of terms for the various restraints [...] 500 kcal mol⁻¹
rad⁻² for angles and improper torsions (which serve to maintain
planarity and chirality), 4 kcal mol⁻¹ [...] repulsion term (with
the van der Waals radii set to 0.8 times their value used in the
[...]) [...] distance restraints (interproton distances and
hydrogen bonds), 200 kcal mol⁻¹ rad⁻² for the torsion angle
restraints, 1 kcal mol⁻¹ Hz⁻² for the [...] coupling constant
restraints, 0.5 kcal mol⁻¹ p.p.m.⁻² for the secondary ¹³C chemical
shift restraints, 7.5 kcal mol⁻¹ [...] chemical shift restraints,
1.0 kcal mol⁻¹ Hz⁻² for the ¹DNH dipolar coupling restraints
[remainder illegible].

None of the structures exhibited interproton distance violations
greater than 0.5 Å, dihedral angle violations greater than [...]
violations greater than 2 Hz. The torsion angle restraints consist
of [...] (two per hydrogen bond, rNH-O = 1.5-2.8 Å, rN-O =
2.4-3.5 Å) were introduced during the final stages [...] amide H-D
exchange experiments, backbone three-bond couplings, ¹³Cα/¹³Cβ
secondary shifts and [...] bound water comprised three interproton
distance restraints between backbone [remainder illegible].

Defined as the average r.m.s. difference (residues 1-101) between
the final 40 simulated annealing structures and the mean
coordinates.

Coordinates. The coordinates of the ensemble of 40 simulated
annealing structures, the restrained regularized mean structure, and
the complete list of experimental NMR restraints and ¹H, ¹⁵N and ¹³C
assignments have been deposited in the Brookhaven Protein Data
Bank (accession codes 2EZM, 2EZN and 2EZMMR).